FastMail.FM GUID upgrade process

2007-10-28 Thread Bron Gondwana
Ok - we're doing our GUID upgrades across the board now.  Here's
the process we're using:

a) wrote a tool that tied a "DB_File" called "cyrus.sha1s" in each
   meta directory on the replicas, parsed the index files looking
   for records with nothing but zeros in the last 16 characters of
   the GUID and calculated the sha1 on each of them.
   This took about 5 days to finish running, but makes the next
   part a lot quicker!

b) wrote a daemon which runs on each host and allows the following
   4 commands:

   *) LOCK user.name.URI%20Escaped.MailboxName
 - lock cyrus.header, cyrus.index, cyrus.expunge in that order
   using fcntl (our cyrus is built with it).
   Also if cyrus.sha1s is missing, attempt to fetch it from the
   replica (but it's OK if this fails, just means all old GUIDs
   will cause a re-calculation)
   *) UPGRADE
 - parse each of cyrus.index and cyrus.expunge.  If any old-style
   GUIDs are found, look first in cyrus.sha1s and, failing that,
   re-calculate from the underlying message files.
 - if any index records need new GUIDs or the old index has not
   yet been upgraded to version 10, stream the index file through
   a Cyrus::IndexFile->stream_copy, altering the necessary GUIDs
   and forcing the output format to version 10 (this module can
   also be used to downgrade if we ever need to!)
 - leave the new file in cyrus.$item.NEW, but mark internally
   that the file has been upgraded.
   *) ROLLBACK
 - if any file has been upgraded, unlink() the .NEW file.
 - unlock expunge, index, header (in that order)
   *) COMMIT
 - if any file has been upgraded, rename() the .NEW to the 
   base filename.
 - unlock expunge, index, header (in that order)

c) wrote a controller script which reads the mailbox listing from the
   master and opens connections to both the master and replica slotd,
   sending the following commands:

   1) master LOCK mailbox (or die)
   2) replica LOCK mailbox (or master ROLLBACK; die)
   3) master UPGRADE (or replica ROLLBACK; master ROLLBACK; die)
   4) replica UPGRADE (or replica ROLLBACK; master ROLLBACK; die)
   5) replica COMMIT (or replica ROLLBACK; master ROLLBACK; die)
   6) master COMMIT (or master ROLLBACK; die NOISILY!!!)

   The only danger point is (6), where you could wind up with an
   upgraded replica without the associated upgraded master.  You
   can go ahead and fix them by hand though, assuming you read the
   NOISILY bit.

d) I think the slashdot crowd would put "Profit!!!" here.  With
   only a short lock time on each index (most sha1s precalculated)
   and no need to multi-rewrite any index file, this will run much
   faster than the alternatives.  I guess I should go clean up the
   cyrus.sha1s files once it's all finished.
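
The ordering in (c) can be sketched in a few lines of Python. This is only an illustration of the six steps and their rollback rules, not the actual controller script: the send() interface, the StepFailed exception and the FakeDaemon stand-in are all assumptions made for the example.

```python
class StepFailed(Exception):
    """Raised after a daemon rejects a command and rollbacks were sent."""


def upgrade_mailbox(master, replica, mailbox):
    """Drive one mailbox through the six steps listed in (c).

    master and replica are any objects with a send(command) -> bool
    method; that interface is an assumption of this sketch.
    """
    steps = [
        # (connection, command, connections to ROLLBACK on failure)
        (master, "LOCK " + mailbox, []),
        (replica, "LOCK " + mailbox, [master]),
        (master, "UPGRADE", [replica, master]),
        (replica, "UPGRADE", [replica, master]),
        (replica, "COMMIT", [replica, master]),
        (master, "COMMIT", [master]),  # replica has already committed
    ]
    for number, (conn, command, undo) in enumerate(steps, start=1):
        if not conn.send(command):
            for other in undo:
                other.send("ROLLBACK")
            raise StepFailed("step %d (%s) failed" % (number, command))


class FakeDaemon:
    """In-memory stand-in for a daemon connection, for dry runs only."""

    def __init__(self, fail_on=None):
        self.fail_on = fail_on
        self.log = []

    def send(self, command):
        self.log.append(command)
        return command.split()[0] != self.fail_on
```

Keeping the rollback targets next to each step makes the one asymmetric case easy to see: a failure at step 6 can only roll back the master, because the replica has already renamed its .NEW files into place.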


Bron.

Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html


Re: Cyrus IMAPd 2.3.10 Released

2007-11-05 Thread Bron Gondwana
On Sun, Nov 04, 2007 at 07:19:26PM -0800, Rich Wales wrote:
> What is the current status of 2.3.10?  Right after it was announced
> a couple of weeks ago, I saw some people reporting problems.  Are
> there any patches?  Or is 2.3.10 still believed to be OK as is?
> 
> I'm running 2.3.9 on a FreeBSD 6.2 master and an Ubuntu 7.10 replica
> server setup, and I want to upgrade to 2.3.10 in hopes of getting
> rid of some problems with the sync code intermittently crashing, but
> this is a production system, and I don't feel comfortable upgrading
> to 2.3.10 as long as there are unresolved serious bug reports.

FastMail is running 2.3.10 on all our production systems now, and
there are no regressions that I'm aware of.  There are still some
bugs that also existed in 2.3.9, but our local patches aren't for
"bugs" so much as "things that don't work how we would prefer".

The two big "new" bugs that we found in the 2.3.10pre3 codebase when
we rolled that out into production had their fixes accepted back
upstream before the final 2.3.10 was cut.

That said, we're seeing slightly more skiplist corruption than
previously and have not yet determined the cause.  We're backing
up a text-dump of our mailboxes.db files once per hour just in
case - but it's highly probable that the issues are due to
something odd we're doing with long running cyr_dbtool processes.

Bron.



Re: Deleting top-level mailbox with 'delete_mode: delayed'

2007-11-05 Thread Bron Gondwana
On Fri, Nov 02, 2007 at 01:15:37PM -0400, Brian Wong wrote:
> On Nov 2, 2007 12:39 PM, Rudy Gevaert <[EMAIL PROTECTED]> wrote:
> >
> > Brian Wong wrote:
> > > I was testing out Cyrus 2.3.10 and realized that when I set the option
> > >
> > > delete_mode: delayed
> > >
> > > I can not delete top-level mailboxes.
> > >
> > > localhost.localdomain> lm
> > > localhost.localdomain> cm user.bwong
> > > localhost.localdomain> sam user.bwong  c
> > > localhost.localdomain> dm user.bwong
> > > deletemailbox: Operation is not supported on mailbox
> > > localhost.localdomain> lm
> > > user.bwong (\HasNoChildren)
> > >
> > > Disabling the delayed delete gives expected results. The mailbox is
> > > deleted as normal. Anyone else confirm this?
> >
> > I'm just back from holiday (and only catching up on mail).  I always set
> > the 'x' permission.  Could you try that?  If that doesn't work, I'll try
> > to delete a top level mailbox on Monday (I'm running 2.3.10 in test).
> >
> > Rudy
> >
> 
> localhost.localdomain> lam user.bwong
> bwong lrswipkxtecda
> admin kxc
> localhost.localdomain> dm user.bwong
> deletemailbox: Operation is not supported on mailbox
> 
> I think if I did not have the right to delete the mailbox, I would get
> a "Permission Denied" instead of the error I am receiving. Let me know
> what you find when you try it. I feel that if this is really a bug it
> would have been caught before release, but then again I can't think of
> anything atypical with my setup that would cause this problem.

It's almost certainly caused by the code that checks if you're renaming
a "top level mailbox" for a user and special cases it in all sorts of
ways.  I never liked that code much!

My solution was to make DELETED.user.bwong.46A12345 (or similar) also
be considered to be an "INBOX" so it was treated as a user rename.
This seems not to be working in your environment, and I'm really not
sure why.

I don't see anything especially different in our config.  We do set
fast_rename: yes, but that won't work for you anyway because it
relies on a not-yet-perfect patch at our end.

All our mailbox deletes are done as the admin user.  It won't work
if you're not a bona-fide admin (not just a user called admin who
happens to have an ACL).  Check the 'admins:' parameter in your
imapd.conf.

Regards,
 
Bron.

(P.S. your username is scarily similar to mine!)



Re: Cyrus IMAPd 2.3.10 Released

2007-11-06 Thread Bron Gondwana
On Tue, Nov 06, 2007 at 08:57:29AM +0100, Rudy Gevaert wrote:
> Bron Gondwana wrote:
>> On Sun, Nov 04, 2007 at 07:19:26PM -0800, Rich Wales wrote:
>>> What is the current status of 2.3.10?  Right after it was announced
>>> a couple of weeks ago, I saw some people reporting problems.  Are
>>> there any patches?  Or is 2.3.10 still believed to be OK as is?
>>>
>>> I'm running 2.3.9 on a FreeBSD 6.2 master and an Ubuntu 7.10 replica
>>> server setup, and I want to upgrade to 2.3.10 in hopes of getting
>>> rid of some problems with the sync code intermittently crashing, but
>>> this is a production system, and I don't feel comfortable upgrading
>>> to 2.3.10 as long as there are unresolved serious bug reports.
>> FastMail is running 2.3.10 on all our production systems now, and
>> there are no regressions that I'm aware of.  There are still some
>> bugs that also existed in 2.3.9, but we aren't patching against
>> any "bugs" rather than "things that don't work how we would prefer".
>
> I have upgraded three (of seven) mailstores yesterday.  Upgrade went 
> without problems.  Regeneration of the guid takes a long time.  It's not 
> finished yet, and its been running for 12 hours.

That's why I pre-calculated all the sha1s I could and wrote my own index
upgrader :)

Yeah, it takes a while!

Bron.



Re: LARGE single-system Cyrus installs?

2007-11-08 Thread Bron Gondwana
On Thu, Nov 08, 2007 at 10:18:04AM -0800, Vincent Fox wrote:
> Our latest line of investigation goes back to the Fastmail suggestion,
> simply have multiple Cyrus binary instances on a system.  Each running
> its own config and with its own ZFS filesystems out of the pool to use.
> Since we can bring up a virtual interface for each instance we won't even
> have to bother with using separate port numbers, etc.

Also virtual interfaces means you can move an instance without having
to tell anyone else about it (but it sounds like you're going with an
"all eggs in one basket" approach anyway)

Bron.



Re: LARGE single-system Cyrus installs?

2007-11-10 Thread Bron Gondwana
On Fri, Nov 09, 2007 at 01:28:05PM -0500, John Madden wrote:
> On Fri, 2007-11-09 at 19:10 +0100, Jure Pečar wrote:
> > I'm still on linux and was thinking a lot about trying out solaris 10,
> > but stories like yours will make me think again about that ...
> 
> Agreed -- with the things I see from the Solaris (and zfs) and Sparc
> hardware in general, my money's still on Linux/LVM/Reiser/ext3.
> 
> 250,000 mailboxes, 1,000 concurrent users, 60 million emails, 500k
> deliveries/day.  For us, backups are the worst thing, followed by
> reiserfs's use of BKL, followed by the need to use a ton of disks to
> keep up with the i/o.

For us backups are hardly a blip on the radar :)  The joy of writing
your own custom backup system that knows more about Cyrus internals
than just about anything else.  It starts with some stat calls, and
if any of the cyrus.header, cyrus.index or cyrus.expunge files have
changed then it will lock them all then stream them all to the backup
server.
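
A minimal sketch of the "start with some stat calls" step, in Python. Which stat fields the real backup system compares is not stated above, so the choice of size, mtime and inode here is an assumption, as are the function names.

```python
import os

TRACKED = ("cyrus.header", "cyrus.index", "cyrus.expunge")


def snapshot(meta_dir):
    """Return {filename: (size, mtime_ns, inode)} for the tracked files."""
    result = {}
    for name in TRACKED:
        try:
            st = os.stat(os.path.join(meta_dir, name))
        except FileNotFoundError:
            result[name] = None  # e.g. no cyrus.expunge yet
            continue
        result[name] = (st.st_size, st.st_mtime_ns, st.st_ino)
    return result


def needs_backup(meta_dir, previous):
    """True if any tracked file differs from the previous snapshot."""
    return snapshot(meta_dir) != previous
```

Only mailboxes for which needs_backup() returns True would then go on to the more expensive lock-and-stream step.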

The backup server then parses them and decides (based on GUID) if
there are any data files it hasn't yet fetched.  If so, it fetches
them and checks the sha1 of the fetched file against the GUID.

The whole thing takes a couple of seconds per user and requires
less IO than even using direct IMAP calls would.
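
The fetch-and-verify decision can be sketched as follows. This assumes GUIDs are handled as hex strings of the message file's SHA-1; the function names are made up for the example and are not part of the backup system described above.

```python
import hashlib


def guids_to_fetch(index_guids, already_stored):
    """GUIDs named in the parsed index that the backup does not hold yet."""
    stored = set(already_stored)
    return [g for g in index_guids if g not in stored]


def matches_guid(data, guid_hex):
    """True if the SHA-1 of the fetched bytes equals the GUID (hex form)."""
    return hashlib.sha1(data).hexdigest() == guid_hex.lower()
```

Because the GUID is derived from the content, a message that is already stored under that GUID never needs to be transferred again, and a fetched file that fails matches_guid() can be rejected straight away.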

Now our big IO user is cyr_expire.  We run it once per week, and
it's a killer.  I'd be tempted to run it a lot more frequently if
it didn't have such a high baseline IO cost on top of the actual
message unlinks (though the unlinks are the real killer)

Bron ( and the BKL, *sigh*.  I just installed an external RAID
   unit with 8x1TB drives in it for data.  That's 6TB/300GB == 20
   data partitions plus 20 meta partitions to go with it.  That's
   a lot of BKL! )


Re: How many people to admin a Cyrus system?

2007-11-10 Thread Bron Gondwana
On Fri, Nov 09, 2007 at 08:24:00AM -0500, Adam Tauno Williams wrote:
> > > How many
> > > and what sort of people does it take to maintain a system such as
> > > this?  I need a good argument for hiring a replacement for me.
> > At a minimum you want 1 qualified person and someone cross-trained
> > as a backup, so that person can reasonably enough have vacations.
> > Any decent sysadmin should be able to MAINTAIN such a service
> > I don't think actually programming skills should be primary.  
> 
> Agree.  I maintain a Cyrus system.  And on most days that doesn't even
> involve touching it.  Any reasonably proficient person with UNIX skills
> should be able to take over Cyrus administration given they are willing
> to do some reading.
 
I maintain a Cyrus system and it's taken over my life!  Yikes.

Summary for this week:

* crc32 for indexes humming along in the background.
* getting more skiplist corruptions - going to have to post about this.
  We lost 4 mailboxes.db files this week, 3 during controlled failover,
  one during the night when nobody was working on it.  Suspect new ACL
  feature on the web interface which allows more frequent updates is
  causing issues.
* the one buggy skiplist I actually still have a copy of, the "logstart"
  value in the header is wrong, causing recovery to fail with only a few
  of the records still reachable because it hits another INORDER record
  rather than the ADD record and drops out.  I've got the monitoring
  system set up to let me know if it finds any other skiplist errors and
  take a copy of the offending file.

> > I have been
> > doing sysadmin work since 1989 and the actual programming work I've
> > done in that time has been maybe 2% of it.  If you have a lot of custom
> > interface stuff to your campus systems maybe you need more programmer
> > skills.   As a completely inappropriate generalization, former engineers
> > and mathematicians also make good sysadmins because they have the mindset
> > and the skills for problem decomposition and trouble-shooting.
> 
> Yep.

Agree there.  Sysadmin has always been a fraction of my work because I
tend to do a lot of "glue" programming to abstract away anything that's
sysadmin work.  My first really major project (after converting us from
CVS to Subversion) was making all the servers install automatically from
PXE boot and the configurations all set themselves up with "make
install" from the Subversion repository, so that most everyday sysadmin
is now automated - just update the master config file, roll it out,
restart affected services.

So day to day we need less than one sysadmin, but of course incident
response is unpredictable, and having two good sysadmins (Rob and I
share sysadmin responsibility here) available is very handy.  Both
for the "you can let one of them have a holiday" point of view and
the "two heads are better than one" ability to work past the other's
mental blocks and avoid getting stuck in a rut trying to solve 
problems.

Bron.



Skiplist stuff

2007-11-11 Thread Bron Gondwana
On Sun, Nov 11, 2007 at 02:31:45PM +1100, Bron Gondwana wrote:
> * the one buggy skiplist I actually still have a copy of, the "logstart"
>   value in the header is wrong, causing recovery to fail with only a few
>   of the records still reachable because it hits another INORDER record
>   rather than the ADD record and drops out.  I've got the monitoring
>   system set up to let me know if it finds any other skiplist errors and
>   take a copy of the offending file.

Ok, so attached is the patch that deals with this one particular
case and adds a bit more debugging as well.  It will log an ERROR
if it sees the case that would previously cause a total bailout.

NOTE - there's still heaps of "add this value read from file to
this mmap base pointer and dereference with impunity" scattered
through this code, which is the sort of accident waiting to happen
I've been trying to remove from everything else along with the
CRC32 stuff I keep promising.

I also wrote a much bigger patch which you don't get to see yet
because someone might be silly enough to try and use it.  It does
some variable renaming, refactoring, etc.

At least there's some signed vs unsigned fuzziness I'm still not
happy with here, and a naming policy that made it clearer when
a variable contained network-byte-order and when it contained
host-byte-order would be peachy too.  My big patch does that.
Sadly it also infinite loops during recovery, which suggests I
did something pretty stupid in it somewhere!  Too late to debug
that tonight.


Anyway - here it is.  A "recovery()" that copes if the logstart
parameter in the database header is wrong.  No, I don't have a
clue how that happened unless lseek() lied.  Maybe it sometimes
lies, I don't know.  I'll be writing a test case for that soon
too!

Bron.
Index: cyrus-imapd-2.3.10/lib/cyrusdb_skiplist.c
===
--- cyrus-imapd-2.3.10.orig/lib/cyrusdb_skiplist.c	2007-11-11 07:44:25.0 -0500
+++ cyrus-imapd-2.3.10/lib/cyrusdb_skiplist.c	2007-11-11 07:59:47.0 -0500
@@ -1206,7 +1206,7 @@
 lseek(tp->syncfd, tp->logend, SEEK_SET);
 r = retry_writev(tp->syncfd, iov, num_iov);
 if (r < 0) {
-	syslog(LOG_ERR, "DBERROR: retry_writev(): %m");
+	syslog(LOG_ERR, "DBERROR: retry_writev() %s: %m", db->fname);
 	myabort(db, tp);
 	return CYRUSDB_IOERROR;
 }
@@ -1926,20 +1926,13 @@
 
 /* reset the data that was written INORDER by the last checkpoint */
 offset = DUMMY_OFFSET(db) + DUMMY_SIZE(db);
-while (!r && (offset < (bit32) db->logstart)) {
+while (!r && (offset < db->map_size)
+  && (TYPE(db->map_base+offset) == INORDER)) {
 	ptr = db->map_base + offset;
 	offsetnet = htonl(offset);
 
 	db->listsize++;
 
-	/* make sure this is INORDER */
-	if (TYPE(ptr) != INORDER) {
-	syslog(LOG_ERR, "DBERROR: skiplist recovery: %04X should be INORDER",
-		   offset);
-	r = CYRUSDB_IOERROR;
-	continue;
-	}
-	
 	/* xxx check \0 fill on key */
 
 	/* xxx check \0 fill on data */
@@ -1980,6 +1973,11 @@
 	}
 }
 
+if (offset != db->logstart) {
+	syslog(LOG_ERR, "DBERROR: recovery logstart %s: %04X not %04X",
+	   db->fname, offset, db->logstart);
+}
+
 /* zero out the remaining pointers */
 if (!r) {
 	for (i = 0; !r && i < db->maxlevel; i++) {
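
The changed loop condition can be modelled on a toy record list. The sketch below is Python rather than C and does not reproduce the on-disk skiplist format: records are plain (type_name, size) pairs and every name in it is hypothetical. It shows only the idea of the patch, which is to walk INORDER records until something else appears and then compare the resulting offset with the header's logstart instead of trusting logstart as the loop bound.

```python
def scan_inorder(records, first_offset, logstart):
    """Walk the INORDER prefix of a toy record list.

    records is a list of (type_name, size) pairs in file order.
    Returns (inorder_count, end_offset, logstart_ok).
    """
    offset = first_offset
    count = 0
    for type_name, size in records:
        if type_name != "INORDER":
            break  # the log section starts here, whatever the header says
        count += 1
        offset += size
    return count, offset, offset == logstart
```

With a correct logstart the third value is True; with a wrong one the scan still stops at the first non-INORDER record and the mismatch can be logged rather than aborting recovery.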


Re: Replication: does it work in both directions?

2007-11-11 Thread Bron Gondwana
On Sat, Nov 10, 2007 at 07:51:15PM -0800, Rich Wales wrote:
> I'm using replication on a 2.3.9 system.
> 
> I know that if changes happen on the master system, they are propagated
> automatically to the replica system.
> 
> But what happens if I make a change on the replica (e.g., by setting up
> an account to access its mail through the replica's IMAP server)?  I
> tried this just now, and the change is NOT propagating from the replica
> to the master.
> 
> What do I need to do in order for changes made on the replica to get
> copied over to the master?  Or is this simply impossible?

Impossible.  You don't do this.

What you can do (the simple case of what we do) is set up two Cyrus
instances on each machine, replicating to each other, and set up user
accounts on one or the other, so you can get full use of both machines.

Regards,

Bron.



Re: Replication: sync_client -r dies

2007-11-11 Thread Bron Gondwana
On Sat, Nov 10, 2007 at 07:09:53PM -0800, Rich Wales wrote:
> After about a week of having synchronization running perfectly in my
> 2.3.9 system, I finally got another bailout incident with sync_client
> on my master server.
> 
> This happened just after I shut down my replica server (to move it to
> a different location).  About two minutes after the replica went down,
> sync_client on the master said "Error in do_sync(): bailing out!" with
> no other messages of any kind.
> 
> It seems to me that the replication code ought to be a bit more robust
> than this when a replica goes down or loses network connectivity.  Is
> the 2.3.10 code any better than 2.3.9 in the way this kind of situation
> is handled?

I believe David Carter has been working on some stuff for this which is
lined up to go in soon.

We just have a monitor_sync script that runs every 10 minutes from cron
and can recover from this and a variety of other interesting situations.

Yeah - it would be nice to have a way to tell the master "going down
now, be back later".

Bron.



Multiple skiplist bugs found, patches attached

2007-11-12 Thread Bron Gondwana
On Mon, Nov 12, 2007 at 12:34:34AM +1100, Bron Gondwana wrote:
> Anyway - here it is.  A "recovery()" that copes if the logstart
> parameter in the database header is wrong.  No, I don't have a
> clue how that happened unless lseek() lied.  Maybe it sometimes
> lies, I don't know.  I'll be writing a test case for that soon
> too!

I have some more suspicions now, but I wrote it all up in the
patch header, so here's the bugfixes only patch, a "robustness"
extras patch and the tool I used for testing.

Ken, I know you've done some other work on the file changing
types.  I'd like to be even more aggressive and convert just
about everything to bit32 and also rename some variables, but
I restricted myself in this to only fixing the most ugly case:
offset = htonl(offset).
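
To illustrate why "offset = htonl(offset)" is ugly: after that line the same variable name means a host-order number before the assignment and a network-order one after it. A small Python sketch of the alternative, keeping the two representations apart (the helper names are made up, and this is not how the C patch is written):

```python
import struct


def to_net(offset_host):
    """Pack a host-order integer as 4 big-endian (network order) bytes."""
    return struct.pack("!I", offset_host)


def from_net(offset_net):
    """Unpack 4 network-order bytes back into a host-order integer."""
    return struct.unpack("!I", offset_net)[0]
```

In C the equivalent discipline would be two differently named variables, one per byte order, so that a value read from the file is never used as an offset without being converted first.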

These patches are all against 2.3.10 (in this order), and may
need some fuzz fixing to apply against your latest CVS thanks
to those changes - sorry I haven't done that, but it's getting
on 1am for me, and I've just finished doing a lot of testing
and paring these down to simple and clear patches that don't
touch more than they need to fix the issues.

cyrus-skiplist-bugfixes-2.3.10.diff:

  Ken, please review this patch and consider it for pushing out
  in the next release, preferably soon.  There really are a lot
  of issues I found reviewing the code, and even more so with
  the attached tool that can be used to "hammer" a file with
  all sorts of interesting requests.  There are some really
  nasty skiplist corruptions available if even one process
  is ever killed or segfaults half way through a transaction
  and the first operation to touch said file after this is
  a write.

cyrus-skiplist-robustify-2.3.10.diff:

  If you have been running skiplist on your systems and suspect
  that you may have corrupted databases, you probably want this.
  It adds extra robustness and fixupability to recovery().  It's
  still not going to be crash proof reading line-noise, but it
  detects and fixes all the corruptions I have personally seen.

  (I say this having tested it on all the bogus DBs I had, eg:)

  DBERROR: skiplist recovery /tmp/mb2/mailboxes.db.1194508562: 
  -> 8E6DD8 should be ADD or DELETE

  became:

  skiplist recovery /tmp/mb2/mailboxes.db.1194508562: 
  ->  skipped 136 bytes of zeros at 8E6DD8
  skiplist: recovered /tmp/mb2/mailboxes.db.1194508562 
  ->  (44594 records, 9547108 bytes) in 1 second
  skiplist: checkpointed /tmp/mb2/mailboxes.db.1194508562
  ->  (44594 records, 5350556 bytes) in 1 second

  and:

  skiplist recovery /tmp/mailboxes.db.fail: 
  -> 288C00 should be ADD or DELETE

  became:

  skiplist recovery /tmp/mailboxes.db.fail: 
  -> incorrect logstart 288BD8 changed to 356F94
  skiplist: recovered /tmp/mailboxes.db.fail
  -> (28811 records, 4109664 bytes) in 1 second

  In both cases a second run of the same command (I used
  cyr_dbtool 'show') came up clean - no issues remaining
  and no log entries.


cyrus-hammer-skiplist-2.3.10.diff:

  I used this command to hammer a skiplist database like so:

  sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
  sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
  sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
  sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
  sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
  sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
  sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
  ...

  I've turned the "forget about this transaction" option down so it
  fires a lot less often than in my original tests.  It should still
  fire a couple of times per hammer, but it creates log entries
  even on the fully patched code (rolling back incomplete txn),
  so I didn't want to spam the logs.

Enjoy,

Bron.
SKIPLIST bugfixes

In the past we have had issues with bugs in skiplist on seen
files, and we truncated files at the offset with the issue
since they were only seen data.

Lately, we have had more tools updating mailboxes.db more
often, and have lost multiple mailboxes.db files.

There are two detectable issues:

1) incorrect header "logstart" values causing recovery to
   fail with either unexpected ADD/DELETE records or
   unexpected INORDER records, depending on which side of the
   correct location the wrong logstart value falls.
2) a bunch of zero bytes between transactions in the log
   section.
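
Issue 2 matches what POSIX file systems do when a process writes at an offset beyond the current end of file: the gap reads back as zero bytes. A minimal Python sketch of that general behaviour (not of the Cyrus code; the file, sizes and function names are made up for the example):

```python
import os
import tempfile


def write_past_end(path, keep, resume_at, payload):
    """Truncate path to keep bytes, then write payload at resume_at."""
    with open(path, "r+b") as f:
        f.truncate(keep)
        f.seek(resume_at)  # stale offset, now beyond end of file
        f.write(payload)
    with open(path, "rb") as f:
        return f.read()


def demo():
    """Build a 16-byte file, cut it to 8 bytes, write at the old end."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"A" * 16)
        os.close(fd)
        return write_past_end(path, keep=8, resume_at=16, payload=b"NEXT")
    finally:
        os.remove(path)
```

demo() returns the 8 surviving bytes, then 8 zero bytes, then the new payload: the same shape as a run of zeros sitting between two transactions in the log.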

The attached patch fixes the following issues:

a) recovery failed to update db->map_base if it truncated
   a partial transaction.  This reliably recreated the
   zero bytes issues above by having the next store command
   lseek to a location past the new end of the file, and
   hence fill the remainder with blanks.

b) the logic in the "delete" handler for detecting "no
   record exists" (ptr == db->map_base) was backwards,
   meaning t

Re: Replication: does it work in both directions?

2007-11-12 Thread Bron Gondwana
On Sun, Nov 11, 2007 at 08:41:04PM -0800, Rich Wales wrote:
> Earlier, I wrote:
> 
> >> What do I need to do in order for changes made on the replica
> >> to get copied over to the master?
> 
> Bron Gondwana replied:
> 
> > Impossible.  You don't do this.  What you can do (the simple
> > case of what we do) is set up two Cyrus instances on each
> > machine, replicating to each other, and set up user accounts
> > on one or the other, so you can get full use of both machines.
> 
> I note that sync_client can take a list of mailboxes on the command
> line.  Does this define (and limit) the set of mailboxes that are
> replicated?  If a mailbox is listed in the command line, are sub-
> mailboxes replicated too?

It doesn't work like that.  Rolling replication gets events from
actions on mailboxes (lmtp deliver, imapd updates, etc) and logs
them - then the sync_client process running in the background
reads that log file and uses the actions to know what things to
check and sync with the sync_server on your replica.
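
As a rough picture of the log-reading side, the sketch below collapses a log into an ordered list of unique actions. The "ACTION argument" line format and the de-duplication are assumptions made for illustration; the real log format and what sync_client does with repeated entries are not described here.

```python
def pending_actions(log_lines):
    """Collapse a replication log into an ordered list of unique actions.

    Each line is assumed to look like "ACTION argument"; the real
    Cyrus log format may differ.
    """
    seen = set()
    ordered = []
    for line in log_lines:
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        action = (parts[0], parts[1].strip())
        if action not in seen:
            seen.add(action)
            ordered.append(action)
    return ordered
```

The point is that the work list comes from what actually happened on the master, not from a list of mailboxes given on the command line.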

> My environment (family network) only has half a dozen users, and the
> set of users changes only rarely.  Suppose I do the following:
> 
> (1) I divide my users into two groups -- each group assigned to one
> of my two Cyrus servers as the master for those users.
> 
> (2) The sync_client line in cyrus.conf for each server lists the
> mailboxes for the users assigned to that server as master.  Each
> user is listed in the sync_client command line of only one server.
> 
> (3) Each server is configured (via the sync_... lines in imapd.conf)
> to sync to the other server.
> 
> (4) Both servers would be running sync_server.
> 
> So, I would have replication set up going both directions between my
> two servers, but the sets of users handled in each direction would be
> disjoint.  Each user would be assigned to one IMAP server (the master
> for their mailbox collection), and the other server would be their
> replica and act as their backup.
> 
> Would this work?

You are evil.  While I can't see any particular reason why it wouldn't,
I'm still scared.  I wouldn't be game to mess with that.  You'd really
REALLY want to be sure that your email delivery and IMAP connections
only happened to the approved master for each user or you'd get a bad
case of split brain.
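
One cheap guard against that split brain is to check the assignment itself before touching any configuration. A minimal Python sketch, with made-up server and user names, that reports any user claimed by more than one master:

```python
def check_assignment(masters):
    """masters maps server name -> set of users it is master for.

    Returns {user: sorted server names} for users claimed by more
    than one server; an empty dict means the sets are disjoint.
    """
    owners = {}
    for server, users in masters.items():
        for user in users:
            owners.setdefault(user, []).append(server)
    return {u: sorted(s) for u, s in owners.items() if len(s) > 1}
```

This only validates the intended layout; it does nothing to stop deliveries or IMAP logins reaching the wrong server, which still has to be enforced elsewhere.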
 
> Remember, again, that I'm talking about a small installation.  Clearly,
> a scheme requiring every user's mailbox to be explicitly listed in one
> or the other server's sync_client line is not going to scale to a large
> setup with hundreds or thousands of users; I understand this.
> 
> If this idea of doing two-way partial replication with a single Cyrus
> instance on each server will in fact work, should I use the same value
> for sync_machineid on both servers?  Or should they be different?

If you use 2.3.10 then it really doesn't matter at all.  It's a
relic of worser UUIDs (now sha1 based GUIDs) that nobody wants to
talk about ever again.  That said, the code probably still requires
you set it, and I'd set them differently just for the "why not"
factor.  Maybe something deep in the code still cares and I can't
be arsed checking right now, I've been reading the skiplist code
for days, and I'm sure it will give me nightmares when I calm down
enough to sleep!

Bron.



Re: Renaming heirarchies from the bottom

2007-11-12 Thread Bron Gondwana
On Mon, Nov 12, 2007 at 11:38:54AM +, Ian G Batten wrote:
> A common scenario when we are moving users between partitions is to  
> want to move their archived mail first, because there's less risk of  
> disrupting them, and then their top-level mailbox last.

For a different approach - we use sync_client and some hacked
together config directory to replicate all their email as close
as possible to up-to-date, then lock down all incoming connections
and run it again.  Finally, we check that all the data got there
OK (we have a tool for comparing two copies via IMAP) and then
update the delivery paths and proxies and up they come again.
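
The comparison step can be sketched independently of IMAP. The function below assumes each copy has already been listed into a mapping of folder name to {uid: digest}; how that listing is gathered, and the data shape itself, are assumptions of this example rather than a description of the actual tool.

```python
def compare_copies(source, target):
    """Each argument maps folder name -> {uid: digest}.

    Returns a sorted list of (folder, uid, problem) tuples.
    """
    problems = []
    for folder in sorted(set(source) | set(target)):
        src = source.get(folder, {})
        dst = target.get(folder, {})
        for uid in sorted(set(src) | set(dst)):
            if uid not in dst:
                problems.append((folder, uid, "missing on target"))
            elif uid not in src:
                problems.append((folder, uid, "extra on target"))
            elif src[uid] != dst[uid]:
                problems.append((folder, uid, "content differs"))
    return problems
```

An empty result is the signal to go ahead and switch the delivery paths and proxies; anything else means another sync pass or a closer look.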

Any sort of rename where they get presented with their folders
slowly disappearing sounds fraught with confusion to me!

(renaming of INBOX is special in lots of ways)

That said, no comments on the general merits of your approach.
The important thing is to check if the data transferred
correctly :)


Bron.



Re: Replication: does it work in both directions?

2007-11-12 Thread Bron Gondwana

On Mon, 12 Nov 2007 09:37:12 -0800, "Rich Wales" <[EMAIL PROTECTED]> said:
> Bron Gondwana wrote:
> 
> > It doesn't work like that.  Rolling replication gets events from
> > actions on mailboxes (lmtp deliver, imapd updates, etc) and logs
> > them - then the sync_client process running in the background
> > reads that log file and uses the actions to know what things to
> > check and sync with the sync_server on your replica.
> 
> OK, that's the answer I needed to hear.  If a mailbox list on the
> command line is not compatible with rolling replication, then I'll
> simply not have a choice but to set up two Cyrus instances if I
> want to spread my users across different IMAP servers.

It works fine as a one-off, but not for rolling, because rolling reads
the log.

That said, only users who have had any actions on that server will
create log entries.

I am quite tempted to test your theoretical layout at some point, but
right now I'm heartily sick of playing with Cyrus and am going to take
a break and do something totally different.

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: Multiple skiplist bugs found, patches attached

2007-11-13 Thread Bron Gondwana

On Tue, 13 Nov 2007 09:12:18 +0100 (CET), "Simon Matter" <[EMAIL PROTECTED]> said:
> > On Mon, Nov 12, 2007 at 12:34:34AM +1100, Bron Gondwana wrote:
> >> Anyway - here it is.  A "recovery()" that copes if the logstart
> >> parameter in the database header is wrong.  No, I don't have a
> >> clue how that happened unless lseek() lied.  Maybe it sometimes
> >> lies, I don't know.  I'll be writing a test case for that soon
> >> too!
> >
> > I have some more suspicions now, but I wrote it all up in the
> > patch header, so here's the bugfixes only patch, a "robustness"
> > extras patch and the tool I used for testing.
> >
> > Ken, I know you've done some other work on the file changing
> > types.  I'd like to be even more aggressive and convert just
> > about everything to bit32 and also rename some variables, but
> > I restricted myself in this to only fixing the most ugly case:
> > offset = htonl(offset).
> >
> > These patches are all against 2.3.10 (in this order), and may
> > need some fuzz fixing to apply against your latest CVS thanks
> > to those changes - sorry I haven't done that, but it's getting
> > on 1am for me, and I've just finished doing a lot of testing
> > and paring these down to simple and clear patches that don't
> > touch more than they need to fix the issues.
> >
> 
> > cyrus-skiplist-bugfixes-2.3.10.diff:
> >
> 
> > cyrus-skiplist-robustify-2.3.10.diff:
> >
> 
> Hi Bron,
> 
> I didn't have much trouble with skiplist over the years and it has been a
> blessing since moving away from BDB. But I did have a few issues with
> broken skiplist files so your patches are very welcome. I have included
> the patches in my private rpm packages to try how they work. Do you
> recommend both for general consumption?

They've been running for 24 hours on all our production systems with
no ill effects :)

Seriously - yes, I do.  They are quite short, and they're the culmination
of about 3 days of pretty heavy work over the weekend and Monday after we
lost a mailboxes.db on our busiest store to one of these bugs (my wife and
kids were getting ready to kill me for neglecting them towards the end,
I'm sure!)  I built multiple different patches and tested them over that
time.  I also wrote a Perl module that can read skiplist files natively
and tested some things with that as well.

These couple of patches I have posted are the best bits of those distilled
down into the simplest and clearest small set of changes.  They've been hit
pretty hard with the hammer scripts.


I've also got another patch which I'll attach here that I wrote today which
re-tunes the "how often to checkpoint" calculation.  I want our mailboxes.db
files especially to checkpoint more frequently, as that will make them
less "seeky" - which will help with cachelines at least.  We have enough
memory (and always plenty free) that I'm sure every page is hot in cache
within a few minutes.

The seekyness is mainly an issue with clients doing "LIST", which our web
interface does at login, so we want it to be as quick as possible.

As for seen files - well, they tend to be small and frequently updated,
so they'll just checkpoint about 4 times as often now.  Will save a tiny
bit of disk space but more interestingly reduce the memory footprint to
keep them all in cache.


Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]

Skiplist tuning

With random changes to a mailboxes.db file, it could be nearly 100%
random seeks before it recompressed.

A seen file would need to reach 16kb before even considering
re-compressing, with a real data length of just a couple of hundred
bytes.

This patch reduces the limits to:

4kb overhead
120% rather than 200% of current "sorted" size.
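
To see what the new limits mean in practice, here is a small standalone
sketch of the two checkpoint tests.  The function and enum names are made
up for illustration; only the constants and the comparisons are taken from
the patch.  For a seen file with 200 bytes of sorted data, the old rule
waits until the file passes 2*200 + 16834 = 17234 bytes, while the new one
triggers past 12*200/10 + 4096 = 4336 bytes, which is roughly the "4 times
as often" mentioned above.

```c
#include <stdint.h>

/* Minimum log overhead before and after the patch (values from the diff). */
enum { OLD_MINREWRITE = 16834, NEW_MINREWRITE = 4096 };

/* Old rule: the log end must pass 200% of the sorted size plus the overhead. */
int should_checkpoint_old(uint64_t logstart, uint64_t logend)
{
    return logend > 2 * logstart + OLD_MINREWRITE;
}

/* New rule: 120% of the sorted size plus the smaller overhead. */
int should_checkpoint_new(uint64_t logstart, uint64_t logend)
{
    return logend > 12 * logstart / 10 + NEW_MINREWRITE;
}
```

Here logstart stands for the size of the file as of the last checkpoint and
logend for its current size, matching db->logstart and tid->logend in the
patch.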
Index: cyrus-imapd-2.3.10/lib/cyrusdb_skiplist.c
===
--- cyrus-imapd-2.3.10.orig/lib/cyrusdb_skiplist.c	2007-11-12 23:53:34.0 -0500
+++ cyrus-imapd-2.3.10/lib/cyrusdb_skiplist.c	2007-11-12 23:57:38.0 -0500
@@ -302,7 +302,7 @@
 SKIPLIST_VERSION = 1,
 SKIPLIST_VERSION_MINOR = 2,
 SKIPLIST_MAXLEVEL = 20,
-SKIPLIST_MINREWRITE = 16834 /* don't rewrite logs smaller than this */
+SKIPLIST_MINREWRITE = 4096 /* don't rewrite logs smaller than this */
 };
 
 #define BIT32_MAX 4294967295U
@@ -1392,8 +1392,8 @@
 }
 
  done:
-/* consider checkpointing */
-if (!r && tid->logend > (2 * db->logstart + SKIPLIST_MINREWRITE)) {
+/* consider checkpointing (journal is 20% of data length) */
+if (!r && tid->logend > (12 * db->logstart / 10 + SKIPLIST_MINREWRITE)) {
 	r = mycheckpoint(db, 1);
 }
 


Re: Deleting top-level mailbox with 'delete_mode: delayed'

2007-11-13 Thread Bron Gondwana
On Tue, Nov 13, 2007 at 01:11:49PM +0100, Simon Matter wrote:
> > expunge_mode: delayed
> > delete_mode: delayed
> 
> I've just tried a batch delete of mailboxes and hit the same wall.
> 
> Mailbox deletion doesn't work anymore with 2.3.10 if "delete_mode:
> delayed". If "delete_mode: immediate" it works, but with delayed I get
> "deletemailbox: Operation is not supported on mailbox".
> 
> Did I miss something? Does anybody have a patch?

I have "delete_mode: immediate" on the replica and 
"delete_mode: delayed" on the master.  It doesn't make any sense
for the replica to do a delayed delete, as the master is already
generating a "RENAME" event (well, two MAILBOX events actually,
let's not get picky) with the old and new names for the mailboxes.

The replica will be doing a rename rather than a delete in the
most frequent case anyway.  If you're unlucky and the two
MAILBOXES calls get split up (probably some other event on the
first mailbox from earlier being run after you've done the 
rename - don't you love concurrency) then it will issue a
DELETE on the replica.  If that causes a rename instead you
will wind up with _TWO_ deleted folders with very similar
names, one containing all the messages that were still on
the replica and one containing all the messages in the
copy on the master.

Better just to re-copy those ones from the master in the
unlucky case if you ask me.

Bron.



Re: LARGE single-system Cyrus installs?

2007-11-13 Thread Bron Gondwana
On Tue, Nov 13, 2007 at 10:24:22AM +, David Carter wrote:
> On Sun, 11 Nov 2007, Bron Gondwana wrote:
> 
> >> 250,000 mailboxes, 1,000 concurrent users, 60 million emails, 500k 
> >> deliveries/day.  For us, backups are the worst thing, followed by 
> >> reiserfs's use of BLK, followed by the need to use a ton of disks to 
> >> keep up with the i/o.
> >
> > For us backups are hardly a blip on the radar :)  The joy of writing 
> > your own custom backup system that knows more about Cyrus internals than 
> > just about anything else.  It starts with some stat calls, and if any of 
> > the cyrus.header, cyrus.index or cyrus.expunge files have changed then 
> > it will lock them all then stream them all to the backup server.
> 
> Cyrus is pretty ideal for fast incremental updates to a backup system: 
> hence replication. You shouldn't need to lock anything with delayed 
> expunge, delayed delete and fast rename in place.

If you're planning to lift a consistent copy of a .index file, you need
to lock it for the duration of reading it (read lock at least).

Yeah - replication is one way to do it.  We happen to read from the
masters at the moment, but it would be pretty trivial to switch to
using the replicas (change a $Store->MasterSlot() to
$Store->ReplicaSlot() at one place in the code in fact) if we wanted to.

But since I would like a consistent snapshot of the mailbox state,
I lock the cyrus.header and then the cyrus.index and then (if it's
there) the cyrus.expunge.  That means no sneaky process could (for
example) delete the mailbox and create another one with the same
name while I was busy downloading the last file - giving me totally
bogus data.  This is particularly important because I store things
by mailbox uniqueid rather than imap path (with pointers from the
imap path of course) so that a folder rename turns into a symlink
delete (well, replacement with one having an empty target anyway) 
and a symlink create in the tar file.
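
A rough sketch of that lock ordering in C, assuming fcntl locking and read
locks only.  The helper names (read_lock_fd, lock_mailbox_files,
unlock_mailbox_files) are hypothetical and are not part of Cyrus or of the
backup system described here; a real tool would also need error reporting
and probably timeouts.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Take a whole-file read lock, blocking until it is granted. */
int read_lock_fd(int fd)
{
    struct flock fl = {0};
    fl.l_type = F_RDLCK;
    fl.l_whence = SEEK_SET;     /* l_start = 0 and l_len = 0: the entire file */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Closing a descriptor drops its fcntl locks; release expunge, index, header. */
void unlock_mailbox_files(int fds[3])
{
    int i;
    for (i = 2; i >= 0; i--) {
        if (fds[i] >= 0) close(fds[i]);
        fds[i] = -1;
    }
}

/*
 * Lock cyrus.header, then cyrus.index, then cyrus.expunge if it exists.
 * fds[] receives the open descriptors (-1 for a missing cyrus.expunge).
 * Returns the number of files locked, or -1 on failure with nothing held.
 */
int lock_mailbox_files(const char *dir, int fds[3])
{
    static const char *names[3] = { "cyrus.header", "cyrus.index", "cyrus.expunge" };
    char path[4096];
    int i, locked = 0;

    for (i = 0; i < 3; i++) fds[i] = -1;

    for (i = 0; i < 3; i++) {
        snprintf(path, sizeof(path), "%s/%s", dir, names[i]);
        fds[i] = open(path, O_RDONLY);
        if (fds[i] < 0) {
            if (i == 2 && errno == ENOENT) continue;  /* expunge is optional */
            goto fail;
        }
        if (read_lock_fd(fds[i]) < 0) goto fail;
        locked++;
    }
    return locked;

fail:
    unlock_mailbox_files(fds);
    return -1;
}
```

Taking the header lock first is what stops the mailbox from being deleted
and recreated under the same name while the other files are being read.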

Bron ( and right now I'm running the process to finish the upgrade
   from MD5 based to SHA1 based internal identifiers in the
   backup system, since all our indexes are upgraded )



Re: v2.3.10 build fails, pcreposix problems

2007-11-14 Thread Bron Gondwana
On Wed, Nov 14, 2007 at 12:16:29PM -0500, Larry Rosenbaum wrote:
> Upgrading PCRE to v7.4 fixed the problem.  (Previous version was v6.5)

Excellent - because I was a little surprised otherwise - then again I
only tested on Debian Etch because that's all I have handy.

Ken - I think the solution is to put:

#include 

directly before

#include 

in sieve/comparator.h

I tested that that works fine on my systems, and looks like it will
also work on systems with older PCRE that don't do the include
themselves.

RE: your other question - I guess it would be reasonably easy to add
a --disable-pcre to configure so that it never gets tested for or
included, even if it is installed.

Bron.



Re: LARGE single-system Cyrus installs?

2007-11-15 Thread Bron Gondwana
On Thu, Nov 15, 2007 at 01:29:54PM -0500, Wesley Craig wrote:
> On 14 Nov 2007, at 23:15, Vincent Fox wrote:
> > We have all Cyrus lumped in one ZFS pool, with separate filesystems for
> > imap, mail, sieve, etc.  However, I do have an unused disk in each array
> > such that I could setup a simple ZFS mirror pair for /var/cyrus/imap so
> > that the databases are in their own pools.  Or even I suppose a UFS
> > filesystem with directio and all that jazz set.
> 
> About 30% of all I/O is to mailboxes.db, most of which is read.  I  
> haven't personally deployed a split-meta configuration, but I  
> understand the meta files are similarly heavy I/O concentrators.

Which is a good argument for checkpointing it (gah, hate that term -
it's so non-specific.  I've spent some time working on terminology
maps for this stuff, and "repack" is the current winner, mainly due
to being shorter than the runner-up "consolidate")

What was I saying again?  Oh - yes.  Current skiplist metric is that
the mailboxes.db has to be twice the size of the last checkpointed
size plus 16k before it re-checkpoints.  Given that a checkpoint takes
approximately 2 seconds on our systems, and it means that you're not
seeking all over the place any more, it would almost certainly be a
win.

That said, we don't have a single machine where the memory pressure
is tight enough to ever push mailboxes.db out of the cache, so it's
not ever going to be hitting the disk for reads anyway!

Bron.



Re: LARGE single-system Cyrus installs?

2007-11-19 Thread Bron Gondwana
On Mon, Nov 19, 2007 at 08:50:16AM +, Ian G Batten wrote:
>
> On 17 Nov 07, at 0909, Rob Mueller wrote:
>>
>> This shouldn't really be a problem. Yes the whole file is locked for the
>> duration of the write, however there should be only 1 fsync per
>> "transaction", which is what would introduce any latency. The actual writes
>> to the db file itself should be basically instant as the OS should just
>> cache them.
>
> One thing that's worth noting for ZFS-ites is that on ZFS, you can have 
> multiple writer threads in a file simultaneously, which UFS can only do for 
> directio under certain conditions I can't recall.  That's a win for 
> overlapping transactions into a file-based database.   We're not hitting 
> mailboxes.db remotely rapidly enough for this to be an issue, but I can 
> imagine it being so for big shops.
>
> In production releases of ZFS fsync() essentially triggers sync() (fixed in 
> Solaris Next).  So if you anticipate a lot of writes (and hence fsync()s) 
> to mailboxes.db then you don't want mailboxes.db in the same ZFS filesystem 
> as things with lots of un-sync'd writes going on.  I've broken up 
> /var/imap for ease of taking and rolling back snapshots, but it has the 
> handy side-effect of isolating delivery.db and mailboxes.db from all the 
> metadata partitions.

Skiplist requires two fsync calls per transaction (single
untransactioned actions are also one transaction), and it
also locks the entire file for the duration of said 
transaction, so you can't have two writes happening at
once.  I haven't built Cyrus on our Solaris box, so I don't
know if it uses fcntl there, it certainly does on the Linux
systems, but it can fall back to flock if fcntl isn't
available.
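
As a rough illustration of that pattern (one whole-file lock, two fsync
calls per transaction), here is a minimal sketch in C.  It is not the real
skiplist code: the record layout and the 4-byte commit marker are invented
for the example, and the function names are hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Apply or release a whole-file fcntl lock; type is F_WRLCK or F_UNLCK. */
int whole_file_lock(int fd, short type)
{
    struct flock fl;
    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;     /* l_start = 0 and l_len = 0: the entire file */
    return fcntl(fd, F_SETLKW, &fl);
}

/*
 * Append one record as a single transaction:
 * lock, write the data, fsync, write the commit marker, fsync, unlock.
 * Returns 0 on success, -1 on error.
 */
int append_transaction(int fd, const void *rec, size_t len)
{
    static const char commit[4] = { 'C', 'M', 'T', '\n' };  /* made-up marker */
    int r = -1;

    if (whole_file_lock(fd, F_WRLCK) < 0) return -1;

    if (lseek(fd, 0, SEEK_END) < 0) goto done;
    if (write(fd, rec, len) != (ssize_t)len) goto done;
    if (fsync(fd) < 0) goto done;       /* first fsync: the data is on disk */
    if (write(fd, commit, sizeof(commit)) != (ssize_t)sizeof(commit)) goto done;
    if (fsync(fd) < 0) goto done;       /* second fsync: the commit is on disk */
    r = 0;

done:
    whole_file_lock(fd, F_UNLCK);
    return r;
}
```

Because the write lock covers the whole file from the first write to the
second fsync, any other writer blocks in whole_file_lock() for the full
duration, which is the contention being discussed.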

> In my darker moments, by the way, I'm tempted to put deliver.db into tmpfs. 
>  For planned reboot I could copy it somewhere stable, and I could 
> periodically dump it out to disk.  But if I lost it, the consequences 
> aren't serious, and it's most of the write load through that particular 
> filesystem.

Sounds pretty reasonable to me.

>>
>> Still, you have a point that mailboxes.db is a global point of contention,
>> and it is accessed a lot, so blocking all processes on it for a write could be
>> an issue.
>
>
>
>>
>> Which makes me even more glad that we've split up our servers into lots of
>> small cyrus instances, even less points of contention...

Yeah, it's nice.  It's a pain that the entire mailboxes.db blocks
on writes, but it sure keeps the skiplist format simple.  I'd be
interested to see if there are cases where a transaction is kept
open longer than it needs to be though.

Bron.



Re: LARGE single-system Cyrus installs?

2007-11-19 Thread Bron Gondwana

On Tue, 20 Nov 2007 15:40:58 +1100, "Andrew McNamara" <[EMAIL PROTECTED]> said:
> >> In production releases of ZFS fsync() essentially triggers sync() (fixed in
> >> Solaris Next).  
> [...]
> >Skiplist requires two fsync calls per transaction (single
> >untransactioned actions are also one transaction), and it
> >also locks the entire file for the duration of said 
> >transaction, so you can't have two writes happening at
> >once.  I haven't built Cyrus on our Solaris box, so I don't
> >know if it uses fcntl there, it certainly does on the Linux
> >systems, but it can fall back to flock if fcntl isn't
> >available.
> 
> Note that ext3 effectively does the same thing as ZFS on fsync() - because
> the journal layer is block based and does not know which block belongs
> to which file, the entire journal must be applied to the filesystem to
> achieve the expected fsync() semantics (at least, with data=ordered,
> it does).

Lucky we run reiserfs then, I guess...

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: LARGE single-system Cyrus installs?

2007-11-19 Thread Bron Gondwana

On Mon, 19 Nov 2007 22:51:43 -0800, "Vincent Fox" <[EMAIL PROTECTED]> said:
> Bron Gondwana wrote:
> > Lucky we run reiserfs then, I guess...
> >
> >   
> 
> I suppose this is inappropriate topic-drift, but I wouldn't be
> too sanguine about Reiser.  Considering the driving force behind
> it is in a murder trial last I heard, I sure hope the good bits of that
> filesystem get turned over to someone who gives it a future.

There are a bunch of people who know a fair bit about it and have been
happy to help debug issues, including quite recently.  Besides, it's
pretty stable now and isn't bitrotting too badly.

That said, we're hanging out for btrfs to be stable - It would be nice,
and it's sort of inherited a bit from zfs and a bit from reiserfs in its
ways of doing things.

Bron ( running local Maildirs on it right now, synced with offlineimap to
   FM.  I wouldn't dream of running it in production yet - it dies horribly
   if you ever fill it more than about 70% )
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: 2.3.10 Upgrade Question

2007-11-20 Thread Bron Gondwana

On Tue, 20 Nov 2007 12:05:43 +0100 (CET), "Simon Matter" <[EMAIL PROTECTED]> said:
> > In a NON-replicated setup, do the changes to the GUID have an
> > impact?  Can I just put 2.3.10 on with a quick restart of the
> > mailsystem, or is there More To It?
> >
> > I have 1.7TB of mail, about 40K mailboxes, about 10 million pieces of
> > mail.  So I don't want to do an upgrade which will kick off some huge
> > rebuild-fest without planning.
> 
> I have a server with ~ half the size and upgrade has worked without doing
> anything special.

The index files are pretty small, and they rebuild fast :)  They get streamed
into new copies every single expunge anyway.

Also, it will only upgrade each index as it is opened for the first time, so
the load isn't one giant hit.

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: 2.3.10 Upgrade Question

2007-11-20 Thread Bron Gondwana

On Tue, 20 Nov 2007 11:54:24 +, "Ian G Batten" <[EMAIL PROTECTED]> said:
> 
> On 20 Nov 07, at 1146, Bron Gondwana wrote:
> 
> >
> > The index files are pretty small, and they rebuild fast :)  They get streamed
> > into new copies every single expunge anyway.
> 
> What's involved in the rebuild?  I have users with tens of thousands  
> of messages in a single mailbox, so delaying that open while it reads  
> and indexes a few gigs is bad.  Does the rebuild need to open every  
> message?  Read headers?  Bodies?  What?

No - thankfully everything you need is in the index record itself.
If you want the new sha1 GUIDs you need to reconstruct with the -G
flag, but that's not required for the upgrade.  The upgrade just pads
the GUID field with zeros.

It's really quick, we had folders with 200k messages in them, and the
upgrade was only a couple of seconds on a pretty busy server.  It's still
only 96 bytes per record or something, so that's pretty much streaming what,
18MB or something - hardly giant given that it's all sequential reads and
writes, and the CPU isn't doing much work.
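
The arithmetic behind that estimate, as a tiny sketch.  The 96-byte record
size is the approximate figure quoted above, not a value checked against
the index format, and the function name is made up; the index header is
ignored.  200,000 records at 96 bytes each come to 19,200,000 bytes, a
little over 18 MiB.

```c
#include <stdint.h>

/* Bytes streamed when rewriting an index made of fixed-size records. */
uint64_t index_rewrite_bytes(uint64_t records, uint64_t record_size)
{
    return records * record_size;
}
```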

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




FastMail.FM Patchset Updated

2007-11-20 Thread Bron Gondwana
As usual you can get the patches here:

http://cyrus.brong.fastmail.fm/


I've been busy with Cyrus _again_ - so much for my theory
that I was taking a break.

OK - here's what's new.

* http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-bugfixes-2.3.10.diff
  http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-robustify-2.3.10.diff

  Skiplist issues - there were two things that could be found
  in recovery that actually bit us during the whole "restart
  every single store with the new skiplist code" project the
  other day.  ADD where the record already existed and DELETE
  where it didn't.  The latter also had an obnoxious bug where
  it would instead delete _the_alphabetically_NEXT_record_
  silently.  Ooops.

  I rolled these two into my bugfix and robustify patches, not
  realising Ken had already applied the previous copies upstream.
  Ken - do you want me to break this out as a separate patch on
  top of the others?

* http://cyrus.brong.fastmail.fm/patches/cyrus-sync-renamedmailbox-2.3.10.diff

  DelayedDelete of entire users was causing excess copying.
  This fixes it, but the solution is less than ideal and causes
  excess messages about folders not existing during an account
  create.  Annoying.  I'd like a better fix, but this is enough
  for now.  Found this one after fixing...

* http://cyrus.brong.fastmail.fm/patches/cyrus-deletemode-userfix-2.3.10.diff

  This is for upstream.  I made a bogus design decision in the 
  DelayedDelete code that Ken accepted, and it was causing
  bailouts and all sorts of yuckyness.  Made the conditions for
  allowing a folder rename into the DELETED. namespace a lot more
  explicit and correct rather than DELETED.user.foo.TIMESTAMP
  being considered a user's mailbox!  The user cleanup script
  no longer causes massive bailouts on sync.

* http://cyrus.brong.fastmail.fm/patches/cyrus-expunged-nocache-2.3.10.diff

  A little thing to shut up the issue that used to cause segfaults 
  and now just causes logging instead.  Cache offsets in the .expunge
  file can be bogus for deeper architectural reasons.  Rather than
  fix the underlying reasons I just ignore them completely when
  running cyr_expire.  At least that way we're not reading bogus
  cache records.

* http://cyrus.brong.fastmail.fm/patches/cyrus-fastrename-2.3.9.diff

  UPDATED.  It turns out it really doesn't matter what YOU can see
  when you're checking if you can use FastRename.  It matters if
  there are subfolders at all.  Change to passing isadmin true
  and not passing the username to mboxlist_count_inferiors().
  Also need to check if the target path has inferiors to avoid
  log messages and partial move failures that have to back out.
  Much nicer this way.  This means fastrename on replicas isn't
  totally broken any more (before, it would never see the subfolders
  because the replication user didn't have ACLs on them and isadmin
  was being set to false explicitly)
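
To illustrate the kind of DELETE bug described in the skiplist item above,
here is a sketch on a plain sorted array rather than a real skiplist.  The
code is hypothetical and is not taken from cyrusdb_skiplist.c.  The lookup
returns the position of the first key that is not less than the target, so
if the delete step skips the equality check, a missing key removes its
alphabetical successor instead.

```c
#include <string.h>

/* Index of the first key >= target in a sorted array (lower bound). */
size_t find_slot(const char **keys, size_t n, const char *target)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (strcmp(keys[mid], target) < 0) lo = mid + 1;
        else hi = mid;
    }
    return lo;
}

/*
 * Delete target from the sorted array and return the new count.
 * Without the strcmp() equality check, a missing key would delete keys[i],
 * which is the alphabetically next record.
 */
size_t delete_key(const char **keys, size_t n, const char *target)
{
    size_t i = find_slot(keys, n, target);
    if (i == n || strcmp(keys[i], target) != 0)
        return n;                       /* not present: delete nothing */
    memmove(&keys[i], &keys[i + 1], (n - i - 1) * sizeof(keys[0]));
    return n - 1;
}
```

With keys "user.alice", "user.carol" and "user.dave", deleting the missing
"user.bob" leaves all three in place; the unguarded version would have
removed "user.carol".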


Ken - I'd love to see the deletemode-userfix and skiplist stuff
go upstream.  I know you're not happy with fastrename yet, and
fair enough - it's an extra risk and if a shutdown happens in 
the middle of the operation things can get very confused!  The
other two patches are not really long-term good for the Cyrus
codebase so I'd prefer to fix the underlying issues instead :)

Bron.



Re: FastMail.FM Patchset Updated

2007-11-21 Thread Bron Gondwana

On Wed, 21 Nov 2007 07:40:21 +0100 (CET), "Simon Matter" <[EMAIL PROTECTED]> said:

> Hi Bron,
> 
> Did you consider this one
> http://bugzilla.andrew.cmu.edu/show_bug.cgi?id=3006 in the patch above?
> From a quick look it seems both patches conflict, is #3006 obsolete now?

You're right, they probably do conflict and Ken has applied 3006.  I should
have checked on that.  Hmm...

Actually, my patch doesn't even fix #3006.  It does sort of suck to be working
from a stable past point rather than the latest CVS actually when I do these
patches - though it's nice from a stability point of view.

OK - Ken, would you like me to re-build this patch on top of current CVS, or
would you like to do it yourself?  My patch has the replication bugfixes which
are still worth doing, but Simon's logic is actually correct for the test
while mine isn't (though I renamed the function he's using...)

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: reconstruct -u or -U

2007-11-21 Thread Bron Gondwana
On Wed, Nov 21, 2007 at 03:06:39PM -0800, Andrew Morgan wrote:
> The changelog says:
> 
>Added -u and -U options to reconstruct -- courtesy of David Carter.
> 
> But I can't find those options listed in the manpage or the built-in help 
> of reconstruct.  What do those options do?

The changelog needs to be updated.  They got renamed as -g and -G.
They either wipe or calculate the message GUID (globally unique
identifier, we hope!) which is the sha1 of the message contents
now.

Bron.



Re: Murder in replicated mode

2007-11-21 Thread Bron Gondwana
On Wed, Nov 21, 2007 at 09:41:15PM -0300, Diego Woitasen wrote:
> Hi!
>   I'm trying to setup murder in replicated mode. My schema is:
>   
>   -Two servers + one shared storage
>   -Redhat Cluster Suite (RHEL 5.1) with GFS2 working.
>   -Cyrus 2.3.10 in both servers working.
>   -spool and sieve directories on GFS
>   -config dir on local filesystems.
> 
>   I have configured Cyrus in replicated mode but when I create an
>   account in the master server, it isn't replicated unless Cyrus
>   is restarted on either node. It isn't an authentication problem,
>   when I create an account with cyradmin mupdate does nothing.

From memory you may have to actually deliver a message to one of
the user's folders before it will replicate.  At least that's what
the docs suggested.  Of course we always LMTP deliver a message
right at creation time, so we never bothered to check this.

Bron.



Re: FastMail.FM Patchset Updated

2007-11-21 Thread Bron Gondwana
On Wed, Nov 21, 2007 at 06:37:17AM -0500, Ken Murchison wrote:
> Bron Gondwana wrote:
>> On Wed, 21 Nov 2007 07:40:21 +0100 (CET), "Simon Matter" 
>> <[EMAIL PROTECTED]> said:
>>> Hi Bron,
>>>
>>> Did you consider this one
>>> http://bugzilla.andrew.cmu.edu/show_bug.cgi?id=3006 in the patch above?
>>> From a quick look it seems both patches conflict, is #3006 obsolete now?
>> You're right, they probably do conflict and Ken has applied 3006.  I should
>> have checked on that.  Hmm...
>> Actually, my patch doesn't even fix #3006.  It does sort of suck to be working
>> from a stable past point rather than the latest CVS actually when I do these
>> patches - though it's nice from a stability point of view.
>> OK - Ken, would you like me to re-build this patch on top of current CVS, or
>> would you like to do it yourself?  My patch has the replication bugfixes which
>> are still worth doing, but Simon's logic is actually correct for the test
>> while mine isn't (though I renamed the function he's using...)
>
> If you have the time.  I'm off banging on other things at the moment.

Attached are tidied up versions of both my earlier patches on top of
what you've already got in CVS.  They apply fine on top of current
CVS trunk:

[EMAIL PROTECTED]:/work/cvs/cyrus-imapd# patch -p1 < 
/work/cyrus-imapd-2.3.10/patches/cyrus-skiplist-newfixes-2.3.10.diff
patching file lib/cyrusdb_skiplist.c
Hunk #1 succeeded at 2113 (offset -1 lines).
Hunk #2 succeeded at 2140 (offset -1 lines).
Hunk #3 succeeded at 2166 (offset -1 lines).
Hunk #4 succeeded at 2206 (offset -1 lines).
[EMAIL PROTECTED]:/work/cvs/cyrus-imapd# patch -p1 < 
/work/cyrus-imapd-2.3.10/patches/cyrus-deletemode-userfix-2.3.10.diff
patching file imap/mboxname.c
patching file imap/mboxlist.c
patching file imap/mboxlist.h
patching file imap/mboxname.h
patching file imap/imapd.c
Hunk #1 succeeded at 5006 (offset 2 lines).
Hunk #2 succeeded at 5098 (offset 2 lines).


Bron.



Re: FastMail.FM Patchset Updated (new patches)

2007-11-21 Thread Bron Gondwana
On Wed, Nov 21, 2007 at 06:37:17AM -0500, Ken Murchison wrote:
> Bron Gondwana wrote:
>> On Wed, 21 Nov 2007 07:40:21 +0100 (CET), "Simon Matter" 
>> <[EMAIL PROTECTED]> said:
>>> Hi Bron,
>>>
>>> Did you consider this one
>>> http://bugzilla.andrew.cmu.edu/show_bug.cgi?id=3006 in the patch above?
>>> From a quick look it seems both patches conflict, is #3006 obsolete now?
>> You're right, they probably do conflict and Ken has applied 3006.  I should
>> have checked on that.  Hmm...
>> Actually, my patch doesn't even fix #3006.  It does sort of suck to be working
>> from a stable past point rather than the latest CVS actually when I do these
>> patches - though it's nice from a stability point of view.
>> OK - Ken, would you like me to re-build this patch on top of current CVS, or
>> would you like to do it yourself?  My patch has the replication bugfixes which
>> are still worth doing, but Simon's logic is actually correct for the test
>> while mine isn't (though I renamed the function he's using...)
>
> If you have the time.  I'm off banging on other things at the moment.

I guess I should attach them!

Bron.
Fix Delayed Delete replication

Candidate for upstream - now respun on top of
   Simon Matter's allowusermoves fixes from CVS

Deleting user.foo was broken on replicas thanks to
me choosing a bad method for making it work.

Also, it was leaving foo.TIMESTAMP.sub files because
it considered that to be a valid "user".  Oops.

This patch takes a different approach (on top of the
delayed delete code already in Cyrus 2.3.10):

1) mboxname_isusermailbox() no longer returns true for
   DELETED.user.foo.TIMESTAMP

2) The "can I rename this" code checks for the target
   name being either another username or something in the
   DELETED. namespace.
2a) That check uses mboxname_isdeletedmailbox() rather than the
   somewhat (IMHO) misplaced mboxlist_in_deletedhierarchy().

3) if "forceuser" is passed (only done by sync_server),
   then all ACL checks are completely bypassed, so the
   rename will always succeed on the replica.
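
A standalone sketch of points 1 and 2a, for readers who want to try the
name checks outside Cyrus.  The sketch_ function names are made up, and
the deleted prefix and the virtual-domains switch are passed as parameters
instead of being read from the configuration as the real patch does.  The
"DELETED" prefix and the sample timestamp in the notes after the code are
assumptions for the example.

```c
#include <stddef.h>
#include <string.h>

/* Skip a leading "domain!" part when virtual domains are enabled. */
static const char *skip_domain(const char *name, int virtdomains)
{
    const char *p;
    if (virtdomains && (p = strchr(name, '!')))
        return p + 1;
    return name;
}

/* 1 if name is "<prefix>.<something>" after the optional domain part. */
int sketch_isdeletedmailbox(const char *name, const char *prefix, int virtdomains)
{
    const char *start = skip_domain(name, virtdomains);
    size_t len = strlen(prefix);
    return strncmp(start, prefix, len) == 0 && start[len] == '.';
}

/*
 * Returns the part after "user." for a user mailbox, or NULL.
 * With isinbox set, only a top-level "user.<name>" matches.
 */
const char *sketch_isusermailbox(const char *name, int isinbox, int virtdomains)
{
    const char *start = skip_domain(name, virtdomains);
    if (strncmp(start, "user.", 5) == 0 && (!isinbox || !strchr(start + 5, '.')))
        return start + 5;
    return NULL;
}
```

With these, "DELETED.user.foo.4745A1B2" counts as a deleted mailbox and is
no longer treated as a user mailbox, while "user.foo" still is.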

Index: cyrus-imapd-2.3.10/imap/mboxname.c
===
--- cyrus-imapd-2.3.10.orig/imap/mboxname.c	2007-11-22 01:22:08.0 -0500
+++ cyrus-imapd-2.3.10/imap/mboxname.c	2007-11-22 01:29:17.0 -0500
@@ -600,35 +600,47 @@
 {
 const char *p;
 const char *start = name;
-const char *deletedprefix = config_getstring(IMAPOPT_DELETEDPREFIX);
-size_t len = strlen(deletedprefix);
-int isdel = 0;
 
 /* step past the domain part */
 if (config_virtdomains && (p = strchr(start, '!')))
 	start = p + 1;
 
-/* step past any deletedprefix */
-if (mboxlist_delayed_delete_isenabled() && strlen(start) > len+1 &&
-	!strncmp(start, deletedprefix, len) && start[len] == '.')  {
-	start += len + 1;
-	isdel = 1; /* there's an add'l sep + hextimestamp on isdel folders */
-}
-
 /* starts with "user." AND
  * we don't care if it's an inbox OR
- * there's no dots after the username OR 
- * it's deleted and there's only one more dot
+ * there's no dots after the username 
  */
-if (!strncmp(start, "user.", 5) &&
-	(!isinbox || !strchr(start+5, '.') ||
-	 (isdel && (p = strchr(start+5, '.')) && !strchr(p+1, '.'))))
-	return (char*) start+5; /* could have trailing bits if isinbox+isdel */
+if (!strncmp(start, "user.", 5) && (!isinbox || !strchr(start+5, '.')))
+	return (char*) start+5;
 else
 	return NULL;
 }
 
 /*
+ * If (internal) mailbox 'name' is a DELETED mailbox
+ * returns boolean
+ */
+int mboxname_isdeletedmailbox(const char *name)
+{
+static const char *deletedprefix = NULL;
+static int deletedprefix_len = 0;
+int domainlen = 0;
+char *p;
+
+if (!mboxlist_delayed_delete_isenabled()) return(0);
+
+if (!deletedprefix) {
+	deletedprefix = config_getstring(IMAPOPT_DELETEDPREFIX);
+  deletedprefix_len = strlen(deletedprefix);
+}
+
+if (config_virtdomains && (p = strchr(name, '!')))
+	domainlen = p - name + 1;
+
+return ((!strncmp(name + domainlen, deletedprefix, deletedprefix_len) &&
+	 name[domainlen + deletedprefix_len] == '.') ? 1 : 0);
+}
+
+/*
  * Translate (internal) inboxname into corresponding userid.
  */
 char *mboxname_inbox_touserid(const char *inboxname)
Index: cyrus-imapd-2.3.10/imap/mboxlist.c
===
--- cyrus-imapd-2.3.10.orig/imap/mboxlist.c	2007-11-22 01:29:03.0 -0500

Re: Murder in replicated mode

2007-11-22 Thread Bron Gondwana
On Thu, Nov 22, 2007 at 11:01:30PM -0300, Diego Woitasen wrote:
> On Thu, Nov 22, 2007 at 12:50:47PM +1100, Bron Gondwana wrote:
> > On Wed, Nov 21, 2007 at 09:41:15PM -0300, Diego Woitasen wrote:
> > > Hi!
> > >   I'm trying to setup murder in replicated mode. My schema is:
> > >   
> > >   -Two servers + one shared storage
> > >   -Redhat Cluster Suite (RHEL 5.1) with GFS2 working.
> > >   -Cyrus 2.3.10 in both servers working.
> > >   -spool and sieve directories on GFS
> > >   -config dir on local filesystems.
> > > 
> > >   I have configured Cyrus in replicated mode but when I create an
> > >   account in the master server, it isn't replicated unless Cyrus
> > >   is restarted on either node. It isn't an authentication problem;
> > >   when I create an account with cyradmin, mupdate does nothing.
> > 
> > From memory you may have to actually deliver a message to one of
> > the user's folders before it will replicate.  At least that's what
> > the docs suggested.  Of course we always LMTP deliver a message
> > right at creation time, so we never bothered to check this.
> > 
> > Bron.
> 
> I tried that, but it doesn't work. I delivered a message on both
> servers and nothing happened. Again, the mailbox was replicated when I
> restarted Cyrus.
> 
> Any other ideas?
> 
> Maybe I should start reading the mupdate code ... :)

Is rolling replication actually running?  As in - do existing mailboxes
continue to be replicated?

Bron.

Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html


Re: Restricting admin logins

2007-11-29 Thread Bron Gondwana
On Thu, Nov 29, 2007 at 03:54:29PM +0100, Alain Spineux wrote:
> On Nov 29, 2007 3:15 PM, Andy Fiddaman <[EMAIL PROTECTED]> wrote:
> >
> > At the moment we patch the Cyrus IMAP server source so that administrators
> > (admins in the config file) can only log in from certain IP addresses.
> >
> > I was wondering if there is a better way to do this or whether some means
> > of achieving this is planned for future releases?
> 
> Yes: have 3 imapd.conf files, with all the common options in one
> imapd_common.conf, and @include this file in the two others, which have
> different admins options. Then start instances on two different ports and
> add some firewall rules to achieve what you need.

Hey, that's a pretty funky idea :)

We use a nginx proxy with an authentication daemon which rejects all
login attempts as admin.  Our imap machines are firewalled so that
the only ways you can talk to them are imap or pop via the nginx proxy
or send incoming emails to our mxes which will inject them via lmtp to
the spam scanning machines which do the final delivery.

I do like the different configs for a simpler network layout in a
smaller system though.  Very clever!

Bron.



Re: Cyrus upgrade from 2.1.18 to 2.2.13 moved email messages

2007-12-02 Thread Bron Gondwana
On Sun, Dec 02, 2007 at 08:51:51PM +0100, Steinar Bang wrote:
> > Steinar Bang <[EMAIL PROTECTED]>:
> 
> > Steinar Bang <[EMAIL PROTECTED]>:
> > Sebastian Hagedorn <[EMAIL PROTECTED]>:
> 
>  What previously was mail/s/user/sb/ is now mail/u/s/user/sb/
> 
> Here's what I think happened.
> 
> I've had this setting since upgrading from 1.5.19 to 2.1.11 in 2002:
> 
> >  hashimapspool: true
> 
> 
> Ie. it's not new.
> 
> So when I ran this command meant for an upgrade from 1.5.* to 2.* things
> were messed up:
> 
> > $ /usr/lib/cyrus/upgrade/rehash basic
> 
> The rehash script expected a 1.5 structure to work with, and when I fed
> it a 2.1 structure, it moved the wrong directories in the wrong place.
> 
> So the question is what I can do to fix it...?
> 
> If I move 
>  mail/u/user/s/sb /
> to
>  mail/s/user/sb/
> would that fix things...?

Yes.

rehash isn't the nicest script in the whole world.  I have a much nicer
one that's part of the userhash patch at FastMail.  Unfortunately it
doesn't (yet) completely clean up after itself so I need to do a bit
more work before I'm happy to recommend it as a complete replacement.

Bron.



Re: DBERROR

2007-12-06 Thread Bron Gondwana
On Thu, Dec 06, 2007 at 05:56:29PM +0100, Alain Spineux wrote:
> On Dec 6, 2007 5:03 PM, Jeff Blaine <[EMAIL PROTECTED]> wrote:
> > bash-2.05# ls -l *db
> > -rw---   1 cyrusmail 144 Dec  5 16:56 annotations.db
> > -rw---   1 cyrusmail 144 Dec  5 16:56 deliver.db
> > -rw---   1 cyrusmail 144 Dec  5 16:56 mailboxes.db
> 
> 144 bytes !  Not a lot !

It is precisely the size of an empty skiplist file.
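
As a rough illustration of where the 144 comes from, here is a small Python sketch. The breakdown (a 48-byte header plus a 96-byte dummy node with 20 levels) is my recollection of the skiplist backend's on-disk layout and should be treated as an assumption; only the 144-byte total comes from this thread. The helper name looks_empty is made up for the example.

```python
import os

# Assumed layout of an empty skiplist file (recalled, not taken from the thread):
HEADER_SIZE = 48                      # 20-byte magic + seven 32-bit fields
MAXLEVEL = 20                         # assumed maximum skiplist level
DUMMY_SIZE = 4 * (3 + MAXLEVEL + 1)   # type, keylen, datalen, pointers, terminator
EMPTY_SKIPLIST_SIZE = HEADER_SIZE + DUMMY_SIZE


def looks_empty(path):
    """Return True if the file is exactly the size of an empty skiplist."""
    return os.path.getsize(path) == EMPTY_SKIPLIST_SIZE
```

A size match only says the file has no records and no log entries yet; it says nothing about whether the database is the one Cyrus is actually configured to use.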



Re: Cyrus on Solaris at universities?

2007-12-13 Thread Bron Gondwana
On Thu, Dec 13, 2007 at 11:46:19AM +0200, Janne Peltonen wrote:
> On Thu, Dec 13, 2007 at 08:26:03AM +0100, Rudy Gevaert wrote:
> > Vincent Fox wrote:
> > > Just wondering what other universities are running Cyrus on Solaris?
> > > 
> > > We know of:
> > > CMU
> > > UCSB
> > 
> > We have run it, but switched to GNU/Linux about one year and a half ago.
> 
> Same here, except the switch was half a year ago (and long due).
> 
> This said, the main reason we switched to Linux wasn't that there was
> anything wrong with Solaris - but we only have one staff member these
> days that really knows his way around Solaris, whereas there are many
> people with good to excellent Linux competence.

The other reason I'm really happy with Linux is that I don't think I
could post a casual "we've got this weird issue with a recent release
that didn't exist before" to an unrelated topic on a Solaris list and
have the Linus equivalent respond within half an hour explaining what
has probably caused it, giving a couple of patches to try, and pulling
the other main "owners" of that block of code into the discussion.

And then others on the list guiding me through converting those snippets
of code into a supportable, maintainable patch that adds a /proc toggle
to alter the behaviour of the kernel for what we need.  It will probably
be in .25 if it survives the -mm process.

Bron ( nowhere near as much as the amount of code I've written for
   Cyrus this year though! )



Re: Cyrus on Solaris at universities?

2007-12-13 Thread Bron Gondwana
On Thu, Dec 13, 2007 at 04:22:45PM -0800, Vincent Fox wrote:
> Bron Gondwana wrote:
> 
> (Linux comments)
> 
> 'Twas not my intent to start a this-OS versus that-OS comparison.
> Valid though that is, it's a different thread.
> 
> Like most sites, we have various OS in operation here, it just
> happens that the Cyrus backends are Solaris.  The test project
> here started with spare Sun hardware after all.  And you know
> after working with ZFS for a while, dealing with fsck on a large
> mail volume is something I'm very glad to leave behind.

Oh yeah - I'll be glad when that's gone as well.  That's the biggest
downside to using Linux - and if I had the time and resources I'd
be tempted to try a couple of Solaris-on-Intel installs on a couple
of machines and see if it was workable.

I know some people are quite happy on FreeBSD as well - Cyrus really
is quite portable C.  (I think John Capo uses FreeBSD?)

Still, after using Linux (especially Debian) package management and
userland, maintaining a Solaris machine really does feel like a step
back into the dark ages in some respects.  Having thousands of
well packaged applications at your fingertips is pretty handy, and
it's all well integrated.

I do have one ZFS machine, and I don't use it to anywhere near its
capabilities - it's just backups.

Bron ( advocacy 'r' us! )



Re: Plugging into the imap system

2007-12-22 Thread Bron Gondwana
On Sat, Dec 22, 2007 at 04:51:46PM -0300, Diego M. Vadell wrote:
> Hi Gabriele,
>If you are using linux, maybe you can use inotify-tools to notify you of 
> any change in cyrus' spool. 

Be aware that the files on disk are created _before_ the index record,
so you need to wait or poll until the index record has had a chance to
be created.

Even if you capture the "append" event on the cyrus.index file itself,
you still won't see the new record until the 'exists' record in the
index header is updated.
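
A minimal sketch of the "wait or poll" step, in Python. The check itself is left to a caller-supplied function, because how to read the 'exists' count out of cyrus.index is not covered here; wait_until and its parameters are hypothetical names for the example.

```python
import time


def wait_until(predicate, timeout=5.0, interval=0.1,
               sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() until it returns true or the timeout expires.

    predicate is a hypothetical callback, e.g. "has the index header's
    'exists' count caught up with the file I just saw appear?".
    Returns True on success, False if the deadline passed first.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

An inotify handler would call this after seeing a new message file, and only process the message once it returns True.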

Bron.



Re: Plaintext only for loopback?

2008-01-13 Thread Bron Gondwana
On Sun, Jan 13, 2008 at 01:59:48AM -0500, Chris Pepper wrote:
> Hello,
> 
>   I want to allow plaintext auth only for SquirrelMail (running on the 
> Cyrus IMAPd server), and require encrypted authentication over all 
> physical network connections. I see several options governing plaintext 
> auth in the documentation for imapd.conf:

Run two imapd instances from cyrus.conf: one on a high port that you
firewall from everywhere but the squirrelmail server, and the other
on the standard port with a config that denies plaintext.  Then just
point squirrelmail at the high port in its config.

You just need to specify "-C /etc/imapd-sm.conf" or something for the
squirrelmail one.  Personally I would generate both from a template
stored in version control.
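
Generating both files from one template could look something like the Python sketch below. The template contents, the directory paths and the /etc/imapd.conf target are placeholders for illustration; only allowplaintext and the /etc/imapd-sm.conf name come from this thread.

```python
from string import Template

# Shared settings live in one place; only the marked values differ.
TEMPLATE = Template(
    "configdirectory: /var/lib/imap\n"      # placeholder path
    "partition-default: /var/spool/imap\n"  # placeholder path
    "allowplaintext: $allowplaintext\n"
)

# One entry per generated file (file names are examples).
VARIANTS = {
    "/etc/imapd.conf": {"allowplaintext": "no"},
    "/etc/imapd-sm.conf": {"allowplaintext": "yes"},
}


def render_all():
    """Return {path: file contents} for every configured variant."""
    return {path: TEMPLATE.substitute(values)
            for path, values in VARIANTS.items()}
```

Writing the rendered text out and committing the template rather than the generated files keeps the two configs from drifting apart.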

Bron.



Re: Plaintext only for loopback?

2008-01-14 Thread Bron Gondwana
On Sun, Jan 13, 2008 at 07:09:25PM -0800, Phil Pennock wrote:
> It's been a little while since I've done this, so I'm not absolutely
> sure of the details, but if memory serves ...
> 
> Run two different IMAP services from cyrus.conf:
> 
>   SERVICES {
>     imap       cmd="imapd" listen="imap.example.org:imap" prefork=1
>     imaplocal  cmd="imapd" listen="localhost:imap" prefork=2
>   }
> 
> In imapd.conf you can prefix a configuration directive with the service
> name, where the service name is exactly what you specified in SERVICES;
> 
>   imaplocal_allowplaintext: 1
> 
> All possibly wrong, as I say; worth a try though.

Yeah, that's probably actually a better idea than my one in an
environment where you don't have pervasive version control and
templated config file generation in place, otherwise it would
be too easy for the two files to get out of sync when someone
hand edits one of them - with all the debugging nightmare and
strange effects that would follow.
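
To make the service-prefix idea concrete, here is a toy Python model of the lookup order Phil describes: a "<service>_<option>" key wins over the plain option. It is a sketch of the precedence rule only, not of how Cyrus itself parses imapd.conf; parse_conf and get_option are names invented for the example.

```python
def parse_conf(text):
    """Parse 'key: value' lines into a dict, skipping blanks and comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        conf[key.strip()] = value.strip()
    return conf


def get_option(conf, service, key, default=None):
    """Prefer '<service>_<key>' over '<key>', falling back to default."""
    return conf.get(f"{service}_{key}", conf.get(key, default))
```

With "allowplaintext: 0" and "imaplocal_allowplaintext: 1" in one file, the imaplocal service would see 1 and every other service would see 0, so there is only a single file to keep in sync.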

Bron.



Re: segfault of imapd and pop3d

2008-01-15 Thread Bron Gondwana
On Tue, Jan 15, 2008 at 09:18:27AM +0100, Michael Menge wrote:
> Hi,
>
> I found some segmentation faults of my "imapd -s" and "pop3d -s" processes
> in my logs. Has anybody seen this before? I'm running cyrus 2.3.8 on
> SLES10 x86_64.
>
> Jan 14 09:37:44 mailserv08 kernel: imapd[1898]: segfault at 2b9042669978 rip 004805d4 rsp 7fffdf6b51c0 error 4
> Jan 14 10:11:44 mailserv08 kernel: imapd[32213]: segfault at 2b1c29b16978 rip 004805d4 rsp 781f9d00 error 4
> Jan 14 10:13:47 mailserv08 kernel: imapd[30415]: segfault at 2b6440346978 rip 004805d4 rsp 7fffe19d4360 error 4
> Jan 14 10:17:51 mailserv08 kernel: imapd[30420]: segfault at 2adb7dd67978 rip 004805d4 rsp 7fffa3fb3ab0 error 4
> Jan 14 10:53:18 mailserv08 kernel: imapd[2072]: segfault at 2ae8ce23e978 rip 004805d4 rsp 7fff5368d190 error 4
> Jan 14 11:07:14 mailserv08 kernel: pop3d[3402]: segfault at 2b29a2f6a978 rip 0043fe4c rsp 7fff7e8fc510 error 4
> Jan 14 11:33:41 mailserv08 kernel: imapd[3797]: segfault at 2b5a2e971978 rip 004805d4 rsp 733aaeb0 error 4
> Jan 14 12:00:20 mailserv08 kernel: pop3d[5015]: segfault at 2ad8925f3978 rip 0043fe4c rsp 7fff8f72b340 error 4
>
> The last corresponding entry in cyrus logs is always "accepted connection"
>
> Jan 14 11:33:41 mailserv08 imaps[3797]: accepted connection
> Jan 14 12:00:20 mailserv08 pop3s[5015]: accepted connection

Sounds like a corrupted mailbox to me - there's heaps of stuff where
cyrus mmaps a file and blindly goes adding offsets read from it to
the mmap base and accessing the memory there.  Recipe for segfaults
if I ever saw one.

Do you know which user is logging in when this happens?  They're
rare enough that I assume it's not every user on the box!
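
The defensive alternative to "blindly adding offsets" is to range-check every offset read from the file before following it. A small Python sketch of that idea follows; it works on a plain bytes buffer rather than a real Cyrus file, the big-endian 32-bit field is an assumption for the example, and read_u32 is a made-up helper name.

```python
import struct


def read_u32(buf, offset):
    """Read a big-endian 32-bit value at offset, refusing out-of-range reads."""
    if offset < 0 or offset + 4 > len(buf):
        raise ValueError(f"offset {offset} outside buffer of {len(buf)} bytes")
    return struct.unpack_from(">I", buf, offset)[0]
```

Following a pointer then becomes read_u32(buf, read_u32(buf, 0)), and a corrupt offset turns into an error that can be logged with the mailbox name instead of a segfault.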

Bron.



Re: segfault of imapd and pop3d

2008-01-15 Thread Bron Gondwana

On Tue, 15 Jan 2008 16:33:50 +0100, "Michael Menge" <[EMAIL PROTECTED]> said:
> Quoting Bron Gondwana <[EMAIL PROTECTED]>:
> 
> 
> Syslog is logging Cyrus at debug level, and the last line for that
> process is "accepted connection". Normally this line is followed by a
> STARTTLS line and a LOGIN line.
> proc/pid shows the host they are connected with. These IPs are
> different for most processes and I see other successful logins from
> these hosts.
> 
> So I don't think it's a corrupt mailbox. Maybe the tls_sessions.db is
> corrupt.
> Is there a way to check whether a skiplist file is corrupted?

Personally I like sudo -u cyrus cyr_dbtool $file skiplist show > /dev/null

This does a "foreach" over the entire file which should find any issues.  If
there was an exposed way to run "myconsistent" on the file that would be nicer
but the cyrus DB interface doesn't expose all the internals and I've been loath
to fiddle with it since there are a few different DB modules and I'd have to
change all of them.

Of course if you can shut the cyrus instance down you could probably just
delete that file.
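
If you want to run that check over several files, a thin Python wrapper around the same command line might look like this. The command is the one given above; treating a zero exit status as "walked cleanly" is an assumption about cyr_dbtool's behaviour, and the function names are invented for the sketch.

```python
import subprocess


def dbtool_show_argv(path, backend="skiplist", user="cyrus"):
    """Build the 'sudo -u cyrus cyr_dbtool FILE skiplist show' command."""
    return ["sudo", "-u", user, "cyr_dbtool", path, backend, "show"]


def walks_cleanly(path, run=subprocess.run):
    """Run the full 'show' over the file and discard the output.

    Assumption: a non-zero exit status means the walk hit a problem.
    """
    result = run(dbtool_show_argv(path), stdout=subprocess.DEVNULL)
    return result.returncode == 0
```

The run parameter exists only so the wrapper can be exercised without a Cyrus install.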

Bron.

P.S. I've attached another external tool that can find some interesting things.
It needs the attached module to be in your perl lib somewhere too.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]



skiplist_detail.pl
Description: Perl program


Skiplist.pm
Description: Perl program


Re: Move to new server

2008-02-19 Thread Bron Gondwana

On Tue, 19 Feb 2008 14:13:38 +0100, "Paul van der Vlis" <[EMAIL PROTECTED]> said:
> Adam Tauno Williams schreef:
> >> I want to move all mail to a new server. Old server has Cyrus 2.1.18
> >> (Debian Sarge), new server has Cyrus 2.2.13 (Debian Etch).
> >> In the past, I just copied all files in
> >> /var/spool/cyrus/
> >> /var/lib/cyrus
> >> But, is this a good way?
> > 
> > It probably works.
> 
> That's true, but maybe I keep old database-formats ?
> 
> >> Alternative is imapcopy. But I see you need a list of all users and
> >> passwords. That's a lot of work to make (650 users). 
> > 
> > Or just connect as a user with administrative access.  We did a
> > migration with imapcopy,  no need to know all the user's passwords.
> > 
> >> Isn't it possible to use the admin-user to copy everything?
> > 
> > Yep.
> 
> Nice to hear, thanks!
> 
> Can I use something like this in ImapCopy.cfg ?
> 
> #
> # List of users and passwords
> #
> #       SourceUser  SourcePassword  DestinationUser  DestinationPw
> Copy    "cyrus"     "cyruspw"       "cyrus"          "cyruspw"

Does this copy all seen information as well?  Seen is per-user in Cyrus,
so you won't see it if the admin user does the copying.

(and I hate losing seen information!)

Bron ( yes, I do have an alternative to suggest )
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: Endgame: Cyrus big install at UC Davis

2008-02-21 Thread Bron Gondwana

On Tue, 19 Feb 2008 12:50:09 -0800, "Vincent Fox" <[EMAIL PROTECTED]> said:
> So for those of you who recall back that far.
> 
> UC Davis switched to Cyrus and as soon as fall quarter started
> and students started hitting our servers hard, they collapsed.
> Load would go up to what SEEMED to be (for a 32-core T2000)
> a moderate value of 5+ and then performance would fall off a cliff.
> People would be getting timed out, overall it was REALLY bad
> here for several days, lots of pressure
> 
> We are running Solaris10 u4 and using a ZFS pool for the mail store.

Have you read this?

http://blogs.sun.com/roch/entry/when_to_and_not_to

I was forwarded there via:

http://rlwinm.blogspot.com/2007/10/my-parity-iz-pastede-on-yay.html

which I got to via a slashdot comment pointed out on a mailing list,
etc, etc.

Anyway, it looks quite interesting.  Thankfully our only use of ZFS
at the moment is the backup machine where each user's backup consists
of just two files (a sqlite DB and a .tar.gz).  I've already posted to
this list at length about how it works (very well, thank you very much)
and the nice side effect that it detects file corruption automatically
by recalculating the GUID by doing a sha1 on the file as it goes into
the tar, so it can find any underlying issues in the cyrus spools.
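
The corruption check described here boils down to hashing each message file and comparing the result with the GUID on record. A minimal Python sketch, assuming the GUID is stored as the hex form of the message's SHA1 (function names are made up for the example):

```python
import hashlib


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def guid_matches(path, recorded_guid):
    """True if the file's SHA1 equals the GUID on record (case-insensitive)."""
    return sha1_of_file(path) == recorded_guid.lower()
```

Since the backup has to read every byte of the file anyway, the hash comes almost for free.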

Actually, every so often I'm tempted to turn the sqlite database back
into a flat file and gzip that as well - it would cost a bit more to
read, but the IO would _all_ be streaming then, and we have plenty of
memory.  We can still just append the changes.

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: GUID change to SHA1

2008-02-21 Thread Bron Gondwana
On Thu, Feb 21, 2008 at 05:11:48PM +0100, Martin Schweizer wrote:
> Hello
> 
> I use FreeBSD 6.3 and cyrus 2.3.11.  Below is the manual for the change.
> 
> Upgrading from 2.3.9
> 
>  * The method used for generating Globally Unique IDentifiers used
>    for replication has been changed to be the SHA1 hash of the
>    messages. If you wish to upgrade the existing GUIDs in particular
>    mailbox(es) or the entire server, perform the following steps in
>    the listed order. Note that it is NOT REQUIRED that existing GUIDs
>    be upgraded.
>      1. Zero GUIDs on the replica (reconstruct -g)
>      2. Regenerate GUIDs on the master (reconstruct -G)
>      3. Regenerate GUIDs on the replica (reconstruct -G)
> 
> Which is the master and which is the replica? Server 1 or Server 2?
> [...]
> Server 1:
>   syncclient   cmd="/usr/local/cyrus/bin/sync_client -r"

Master

> Server 2:
>   syncserver   cmd="/usr/local/cyrus/bin/sync_server" listen="csync" prefork=0

Replica


Enjoy,

Bron.



Re: Deleted cyrus.* files

2008-02-21 Thread Bron Gondwana
On Thu, Feb 21, 2008 at 11:26:49AM +0100, David Flegl wrote:
> Hi,
> 
>  > try reconstruct from command line.
>  > 1. login as cyrus.
>  > 2. /usr/local/cyrus/bin/reconstruct -r user/bad.user
> 
> Thanks for the reply. I've tried that, but to no effect. Reconstruct said:
> $>/usr/local/cyrus/bin/reconstruct -r user/[EMAIL PROTECTED]
> domain.cz!user.bad^user: Mailbox has an invalid format
> 
> and when I've tried this (without domain):
> $>/usr/local/cyrus/bin/reconstruct -r user/bad.user
> $>
> Command has no response. And no log information.

Try this:

/usr/local/cyrus/bin/reconstruct -rf domain.cz\!user/bad.user

Reconstruct uses the "internal" mailbox format, which is

domain.name\!user/username rather than user/[EMAIL PROTECTED]
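
For scripting, the rewrite from the user@domain spelling to the domain-first spelling shown above is mechanical. A small Python sketch, using example.com as a stand-in domain and a made-up function name:

```python
def to_internal(mailbox):
    """Rewrite 'user/name@domain' as 'domain!user/name'.

    Names without a domain part are returned unchanged.
    """
    if "@" not in mailbox:
        return mailbox
    path, domain = mailbox.rsplit("@", 1)
    return f"{domain}!{path}"
```

When the result is typed into an interactive shell, the '!' still needs quoting or escaping, as in the command above.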

Regards,

Bron.



Re: Abusing the sync protocol for fun and profit.

2008-02-21 Thread Bron Gondwana

On Thu, 21 Feb 2008 09:20:34 -0600, "Dan White" <[EMAIL PROTECTED]> said:
> Bron Gondwana wrote:
> > Attached are three perl modules,
> > 
> > Cyrus/SyncClient.pm
> > Cyrus/ImapReplica.pm
> > Mail/IMAPTalk.pm
> > 
> > I'm including this copy of Mail::IMAPTalk because without it, the clever
> > 'literal' stuff doesn't work properly.  I'll prod Rob to clean it up and
> > re-package it and push it to CPAN so I can depend on that version and
> > have things all be happier.
> 
> Thanks Bron,
> 
> This works great for me. I'm able to synchronize between my old 
> 2.1.17 server, with a perdition proxy frontend, to my newer 
> 2.3.10 server.

Excellent, that's what I like to hear :)

> I had a hiccup in the SyncClient.pm module during DIGEST-MD5 
> authentication.
> 
> I changed to PLAIN, using the following changes, to get it working:

Wow, that wouldn't work for us at all.  I did have to put -p 1 on the
syncserver command line in cyrus.conf before it would let me authenticate
at all, and nothing but DIGEST-MD5 worked for me.  Also, 
Authen::SASL::Cyrus worked fine, but then the connection was encrypted and
I would have had to pipe all the IO through it as well, which I couldn't
be bothered to make work nicely.

> [diff]

Thanks for that.  I probably should make it try both in order or something
funky like that.  Maybe an "auth_digestmd5" and an "auth_plain" function
which are tried in that order.
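
The "try each in order" idea is language-independent; here it is as a Python sketch rather than the Perl of SyncClient.pm. The mechanism callables are placeholders for whatever the real auth_digestmd5 and auth_plain functions would do, and authenticate is a name made up for the example.

```python
def authenticate(mechanisms):
    """Try each (name, attempt) pair in order; return the first name that works.

    attempt is a hypothetical zero-argument callable that raises on failure.
    """
    errors = []
    for name, attempt in mechanisms:
        try:
            attempt()
            return name
        except Exception as exc:  # keep going, but remember why it failed
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all mechanisms failed: " + "; ".join(errors))
```

Collecting the per-mechanism errors matters in practice, because a bare "authentication failed" hides which mechanism the server actually rejected.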

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: Again: GUID change to SHA1

2008-02-25 Thread Bron Gondwana
On Mon, Feb 25, 2008 at 05:52:34PM +0100, Martin Schweizer wrote:
> Hello
> 
>  2008/2/21, Bron Gondwana <[EMAIL PROTECTED]>:
> 
> > On Thu, Feb 21, 2008 at 05:11:48PM +0100, Martin Schweizer wrote:
>  >  > Hello
>  >  >
>  >  > I use FreeBSD 6.3 and cyrus 2.3.11.  Below is the manual for the change.
>  >  >
>  >  > Upgrading from 2.3.9
>  >  >
>  >  >  * The method used for generating Globally Unique IDentifiers used
>  >  >    for replication has been changed to be the SHA1 hash of the
>  >  >    messages. If you wish to upgrade the existing GUIDs in particular
>  >  >    mailbox(es) or the entire server, perform the following steps in
>  >  >    the listed order. Note that it is NOT REQUIRED that existing GUIDs
>  >  >    be upgraded.
>  >  >      1. Zero GUIDs on the replica (reconstruct -g)
>  >  >      2. Regenerate GUIDs on the master (reconstruct -G)
>  >  >      3. Regenerate GUIDs on the replica (reconstruct -G)
>  >  >
>  >  > Which is the master and which is the replica? Server 1 or Server 2?
>  >
>  > > [...]
>  >  > Server 1:
>  >
>  > >   syncclient   cmd="/usr/local/cyrus/bin/sync_client -r"
>  >
>  >
>  > Master
>  >
>  >  > Server 2:
>  >
>  > >   syncserver   cmd="/usr/local/cyrus/bin/sync_server" listen="csync" prefork=0
>  >
>  >
>  > Replica
> 
> 
> Thanks for the hint. I was not sure about the terms.
> 
>  After making the changes (running reconstruct...), I get the following
>  output on the master:
> 
>  grep sync /var/log/debug.log
> 
>  Feb 22 09:09:03 acsvfbsd02 sync_client[73967]: DIGEST-MD5 client step 1
>  Feb 22 09:09:03 acsvfbsd02 sync_client[73967]: DIGEST-MD5 client step 2
>  Feb 22 09:09:03 acsvfbsd02 sync_client[73967]: DIGEST-MD5 client step 3
>  Feb 22 09:15:09 acsvfbsd02 sync_client[73967]: seen_db: user astomas opened /var/imap/user/a/astomas.seen
>  Feb 22 09:19:04 acsvfbsd02 sync_client[74023]: DIGEST-MD5 client step 1
>  Feb 22 09:19:04 acsvfbsd02 sync_client[74023]: DIGEST-MD5 client step 2
>  Feb 22 09:19:04 acsvfbsd02 sync_client[74023]: DIGEST-MD5 client step 3
> 
>  Is that correct? I also got this before I made the changes.

Sorry about the delay in replying.  That just looks like debugging info.
We don't get it, so I wonder if you have a debugging level turned on
that we don't.

The important question is: are your messages being replicated?

Bron.



Re: Fwd: Again: GUID change to SHA1

2008-03-05 Thread Bron Gondwana

On Wed, 5 Mar 2008 07:01:13 +0100, "Martin Schweizer" <[EMAIL PROTECTED]> said:
> (Sorry to ask again, but I have not had an answer yet.)

Oops, I thought I had replied...

>  > Sorry about the delay in replying.  That just looks like debugging info.
>  >  We don't get it, so I wonder if you have a debugging level turned on
>  >  that we don't.
> 
> 
> Here my imapd.conf from the "client":
> [...]
>  sasl_log_level: 7

I'd say that's what is causing it.  Any reason you have this turned up so far?

>  >  The important question is: are your messages being replicated?
> 
> 
> Yes, the messages are replicated all the time.

Cool - I'd say you're fine then.  Nothing to worry about!

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: delayed expunge/delete with replication?

2008-03-10 Thread Bron Gondwana
On Mon, Mar 10, 2008 at 07:53:52AM -0700, David R Bosso wrote:
> Hello,
> 
> I've got a few questions about how delayed expunge works with replication 
> in 2.3.11.
> 
> 1.  If I want delayed expunge & delete on both the replica and master 
> (replicated delayed expunge & delete), do I need them turned on in both 
> master and replica imapd.conf, or is it sufficient that they be on in just 
> the master?
> 
> 2.  Is it sufficient to run cyr_expire (-D -X) on the master - will it be 
> replicated, or does it need to be run on the replica also?

You need both on the replica as well.  The replication protocol doesn't
know about expunged messages at all.

This is a source of occasional frustration to me, because it means if
you do anything remotely funky (like a restore) on the master, then
the replica will have the same record in both the .index and .expunge
files and it all goes to pot.

But finding the time to rewrite the protocol to support both .index
and .expunge contents being streamed to the replica sounds an awful
lot like real work, so we haven't done anything about it.

Bron ( but if you're using delete-rename for folders, turn that off
   on the replica.  The replica will just get a "rename" event
   from the master and it's all good )



Re: Migrate all to skiplist?

2008-03-12 Thread Bron Gondwana
On Wed, Mar 12, 2008 at 12:02:30PM -0400, Shelley Waltz wrote:
> 
> I am migrating my 200 users from a RHAS3 cyrus-imapd-2.2.3 install to a
> RHAS5 cyrus-imapd-2.3.7 install.
> 
> I have been happy with the current setup with the exception of issues with
> the Berkeley DB for mailboxes and deliver.  Recovering a corrupt
> mailboxes.db has been extremely slow, taking on the order of 6hrs to
> recover from the flat file dump.  On occasion I have had to delete the
> deliver.db and restart.
> 
> Reading through many posts, is there any reason to not use skiplist for
> all the databases?  Although I have 200 users, at any given time, only
> half are actively using their account.  Our traffic is light for the most
> part.
> 
> The 2.3.7 defaults as listed in "man imapd.conf" seem to indicate
> that berkeley-nosync is used for duplicate and tlscache and berkeley for
> ptscache(?)  Flatfile for subscription and quotalegacy for quota.
> 
> This will be a single server with a single replica.  Are there any issues
> with not using skiplist for all under this setup?

I wouldn't use skiplists in 2.3.7 on general principles.  The code was
chock full of bugs for ages (ask the Kolab guys about their experience
with it).  Almost all the fixes got into 2.3.11, though I would also
recommend applying:

http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-safelock-2.3.11.diff

http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-state-2.3.11.diff

and

http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-transactions-2.3.10.diff

on top of 2.3.11.  With these, I have been unable to crash skiplists
with all my nasty tests any more.

Bron ( somewhat an expert on that module of the Cyrus code now - I've
   spent a lot of time reading it! )




Re: Is there any way to log/see protocol level commands for a mailbox which is not under user.* ?

2008-03-13 Thread Bron Gondwana
On Thu, Mar 13, 2008 at 11:57:31AM +0100, Ciprian Marius Vizitiu wrote:
> Hi listers, 
> 
>  
> 
> It's written in the manuals that by creating a folder under
> /var/lib/imap/log/username Cyrus will log at protocol level details for
> "username". Question: how can I do the same for a mailbox which is above
> user.* level? Of course I could enable logging for all users =:-o and then
> try to correlate those logs but I thought I should ask. 
> 
>  
> 
> Background: on a perfectly functioning Cyrus IMAP some of my users are...
> abusing one common IMAP folder in subtle ways so I just want  to be able to
> catch the offenders that's why I'm only interested in the "COPY" and/or
> "APPEND" commands. 

My "Auditlog" patches would probably be pretty handy for that!  You
would get log entries for every append and copy, associated via
sessionid to a login event.  You'll need both:

cyrus-sessionid-2.3.11.diff and
cyrus-auditlog-2.3.11.diff

from http://cyrus.brong.fastmail.fm/

(and you'll have to build cyrus from source or a custom package for
your packaging system as appropriate)

You also need to add 'auditlog: yes' to the imapd.conf.

Bron.



Re: reconstruct doing nothing

2008-03-22 Thread Bron Gondwana

On Sat, 22 Mar 2008 11:29:36 +0100, "Alain Spineux" <[EMAIL PROTECTED]> said:
> On Fri, Mar 21, 2008 at 8:59 PM, Rudy Gevaert <[EMAIL PROTECTED]>
> wrote:
> > Gabor Gombas wrote:
> >  > On Fri, Mar 21, 2008 at 04:57:18PM +0100, Bart Coninckx wrote:
> >  >
> >  >> Gabor, is this patch relevant when I do get a result onscreen from
> >  >> "reconstruct"?
> >  >
> >  > Without the patch, "reconstruct -r" processes only the exact mailbox
> >  > given on the command line but does not descend into subfolders, like
> >  > when the "-r" switch was not given at all. At least that's the case with
> >  > my configuration.
> >
> >  Some time ago I noticed the same, but some time after that it did
> >  recurse.  Anyway, doing
> >  reconstruct -rfx user/first.lastname/[EMAIL PROTECTED]
> >  reconstructs the sub folders too.
> >
> 
> I think '*' and '%' are used as wildcards against the list of known
> mailboxes, I mean the mailboxes.db.
> 
> Then
> 
> reconstruct user/first.lastname/[EMAIL PROTECTED]  # without -r
> 
> displays all known sub folders of mailbox [EMAIL PROTECTED] except
> the Inbox itself.
> 
> So to repair an inbox and all its folders, 2 commands are required!
> 
> reconstruct user/[EMAIL PROTECTED]
> reconstruct user/first.lastname/[EMAIL PROTECTED]
> 
> I suppose
> 
> reconstruct -r user/[EMAIL PROTECTED]  # without * but with -r
> 
> SHOULD do the same, but it is not working for me. The -f doesn't help
> either if the mailbox is already known!
> 
> If I copy my Sent folder into a new Foo folder (creating a new mailbox
> without telling cyrus) and then use
> 
> reconstruct -f user/[EMAIL PROTECTED]  # without -r (works the same with -r)
> 
> it displays
> 
> discovered domain.com!first.last.Foo
> 
> 
> Conclusion
> 
> 
> - -r looks to be useless
> - -f discovers as yet unknown folders, recursively too, but only inside a
>   new folder, not if it is already known; use * for a full discovery in
>   two steps: user/[EMAIL PROTECTED] and
>   user/first.lastname/[EMAIL PROTECTED]
> - '*' and '%' allow walking around the mailbox tree, but only inside
>   already known folders
> 
> This was tested on 2.3.11

Try this:

reconstruct -r 'domain.com!user/first.lastname'

(yay internal representations leaking)
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: ctl_mboxlist virtual memory exhausted !

2008-03-25 Thread Bron Gondwana

On Tue, 25 Mar 2008 09:31:40 +0100, "Brasseur Valery" <[EMAIL PROTECTED]> said:
> Hi,
> 
> I am running 2.3.11, with 2 Millions users (4M mailboxes ;-)
> 
> when trying to do a ctl_mboxlist -m, after some time (a few seconds!) I
> get a "virtual memory exhausted", and I can see that the process is
> allocating more than 3Gb of memory!

Ouch. That hurts

> Have any of you encountered this?
> Any way to bypass it?

We split our Cyrus instance up into 300Gb data partitions.  We currently have
56 stores (112 partitions thanks to replication).  Obviously you need
infrastructure to manage this, and some form of frontend proxy to direct user
logins to the correct store (we use nginx).  Further, any users who need to
share mailboxes must be on the same store.

Still, things are a lot faster when your average mailboxes.db is only 4.5Mb in
size (having just checked the one for the store my mailbox is on)

> I also get lots of skiplist corruption when the file size is around 700Mb
> for mailboxes.db, with the mupdate process hitting 100% CPU when it
> happens!
> Any ideas are welcome!

Are you using:

http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-safelock-2.3.11.diff
http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-state-2.3.11.diff
http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-transactions-2.3.10.diff

Finally, are you running a 32 bit operating system?  With a 700Mb mailboxes.db
being mmaped, you might be pushing close to the available process memory space.
Running a 64 bit kernel would probably help a lot there (you will of course need
to have 64 bit hardware!)

I would seriously recommend against having 2 million users all on one machine
on disaster recovery principles - it takes far too long to copy that much data
onto modern drives, so if you lose your drive unit then getting users back up
and running looks like about a week's sitting there watching data copy.  Yes,
I have done that before.  That's why we run partitions that can rebuild from
scratch in 6 hours now.

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: Sieve forwarding loop destroys e-mail

2008-03-31 Thread Bron Gondwana
On Mon, Mar 31, 2008 at 04:21:20PM +0200, Alain Spineux wrote:
> On Mon, Mar 31, 2008 at 2:40 PM, Joseph Brennan <[EMAIL PROTECTED]> wrote:
> >
> >  Jo Rhett <[EMAIL PROTECTED]> wrote:
> >
> >  >  I would ask that you spend some time determining how the
> >  > program could determine it is a bad rule, and provide a patch to fix this
> >  > behavior.  (in short -- it's harder than you think)
> >
> >  A mail delivery system that loses mail is buggy.  I don't need to look
> >  at the code to know that.
> >
> >  You can tell me no one has time to fix it, and in an open source project
> >  I can respect that.  But it is a bug.
> 
> Look at this:
> 
> If my script is
> 
> redirect [EMAIL PROTECTED]
> 
> I expect my mailbox to stay empty, because this is what redirect is
> supposed to do!
> If I find an email in my mailbox this is a BUG, because the script I wrote
> should never let an email come in!

I know, I know - pick me.  How about this one?

discard;


It turns out that a mail delivery system that has been configured in a
way that loses mail has a bug _in_the_person_who_configured_it_.  Now
it may be that the language makes it easy to shoot yourself in the foot,
but that's different from being buggy.

Bron.



Re: Sieve forwarding loop destroys e-mail

2008-03-31 Thread Bron Gondwana

On Mon, 31 Mar 2008 15:51:17 -0700 (PDT), "Andrew Morgan" <[EMAIL PROTECTED]> said:
> On Tue, 1 Apr 2008, Bron Gondwana wrote:
> 
> > On Mon, Mar 31, 2008 at 04:21:20PM +0200, Alain Spineux wrote:
> >> On Mon, Mar 31, 2008 at 2:40 PM, Joseph Brennan <[EMAIL PROTECTED]> wrote:
> >>>
> >>>  Jo Rhett <[EMAIL PROTECTED]> wrote:
> >>>
> >>> >  I would ask that you spend some time determining how the
> >>> > program could determine it is a bad rule, and provide a patch to fix
> >>> > this behavior.  (in short -- it's harder than you think)
> >>>
> >>>  A mail delivery system that loses mail is buggy.  I don't need to look
> >>>  at the code to know that.
> >>>
> >>>  You can tell me no one has time to fix it, and in an open source project
> >>>  I can respect that.  But it is a bug.
> >>
> >> Look at this:
> >>
> >> If my script is
> >>
> >> redirect [EMAIL PROTECTED]
> >>
> >> I expect my mailbox to stay empty, because this is what redirect is
> >> supposed to do!
> >> If I find an email in my mailbox this is a BUG, because the script I
> >> wrote should never let an email come in!
> >
> > I know, I know - pick me.  How about this one?
> >
> > discard;
> >
> >
> > It turns out that a mail delivery system that has been configured in a
> > way that loses mail has a bug _in_the_person_who_configured_it_.  Now
> > it may be that the language makes it easy to shoot yourself in the foot,
> > but that's different from being buggy.
> 
> Just for reference - we provide a web interface (custom, we wrote it) that
> provides the features most people want to configure in their sieve rules 
> such as email forwarding, filtering based on From/To address, vacation 
> messages, and spam blocking.  Of course, they have no idea it is actually 
> sieve behind the scenes.  They just point and click the web interface.
> 
> This web interface has sanity checks to prevent people from doing silly 
> things like forwarding mail to themselves or the other common email 
> aliases on their accounts.
> 
> We also offer direct sieveshell access for users that ask if they can do 
> more than the web interface offers.  If these "smart" users shoot 
> themselves in the foot, oh well.

Sounds remarkably like what we have, except we don't provide a timsieved that
listens to the world - people have to paste their sieve scripts into a web
interface that does syntax tests before uploading it.  Mainly for proxying
reasons, we don't have anything set up that can proxy the sieve protocol,
and we don't allow direct connections to our backend servers.

Yeah, sieve is a weird language in some ways, but it mostly gets the job done
and it's the "cyrus way".  We could probably get much the same by delivering
to plus addresses from our perl lmtp proxy, but why re-design the wheel?

Bron.

-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: Sync_client error Hit upload limit 0

2008-04-02 Thread Bron Gondwana
On Wed, Apr 02, 2008 at 11:28:07AM +1030, Stephen Carr wrote:
> Dear List
> 
> I get the following type of error (see below) during replication that
> appeared after upgrading from 2.3.8 to 2.3.11.
> 
> This occurs occasionally and yet the emails are synced and it occurs for
> various user accounts.
> 
> I have noticed that this error only occurs when an account is getting a
> burst of emails or when several users are getting what seems to be the
> same spam email.
> 
> Is there a timing error?
> 
> I have a rolling replication set to a delay of 5 seconds - should I change
> the interval?

Wow, there shouldn't be a limit of 0.

unsigned max_count = config_getint(IMAPOPT_SYNC_BATCH_SIZE);
if (max_count <= 0) max_count = INT_MAX;

Well, that's somewhat bogus anyway, since it's unsigned.  May
as well be == 0.
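
To make the point about the unsigned comparison concrete, here is a small standalone sketch. It is not the Cyrus code: effective_limit() is a made-up helper name, and its int argument merely stands in for whatever config_getint() hands back. It shows that on an unsigned variable "<= 0" can only ever match zero, and that a negative configured value wraps to a huge positive number rather than being caught:

```c
#include <limits.h>

/* Hypothetical stand-in for the start of upload_messages_list():
 * 'configured' plays the role of the value read from imapd.conf. */
static unsigned effective_limit(int configured)
{
    /* Converting a negative int to unsigned wraps around modulo 2^N. */
    unsigned max_count = (unsigned)configured;

    /* For an unsigned value, "<= 0" and "== 0" are the same test. */
    if (max_count == 0) max_count = INT_MAX;

    return max_count;
}
```

With this helper, 0 maps to INT_MAX, 1000 stays 1000, and -1 comes out as UINT_MAX, which is why a "negative" check on the unsigned variable would never fire.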

But still - I can't see how it could become zero!

syslog(LOG_NOTICE,
   "Hit upload limit %d at UID %lu for %s, sending",
   max_count, index_list->last_uid, mailbox->name);

BAH - upload_messages_from() is broken.

Will reply shortly with a patch,

Bron.



Re: Sync_client error Hit upload limit 0

2008-04-02 Thread Bron Gondwana
On Wed, Apr 02, 2008 at 09:51:50PM +1100, Bron Gondwana wrote:
> On Wed, Apr 02, 2008 at 11:28:07AM +1030, Stephen Carr wrote:
> > I get the following type of error (see below) during replication that
> > appeared after upgrading from 2.3.8 to 2.3.11.
> 
> BAH - upload_messages_from() is broken.
> 
> Will reply shortly with a patch,

Ken - CCing you on this one since you'll want this for CVS.

I have compile tested this - haven't rolled it out anywhere,
but it's pretty trivial.

Regards,

Bron.
Index: cyrus-imapd-2.3.11/imap/sync_client.c
===
--- cyrus-imapd-2.3.11.orig/imap/sync_client.c	2008-04-02 10:56:52.0 +
+++ cyrus-imapd-2.3.11/imap/sync_client.c	2008-04-02 10:57:56.0 +
@@ -1358,7 +1358,7 @@ static int upload_messages_list(struct m
 struct sync_index_list *index_list;
 unsigned max_count = config_getint(IMAPOPT_SYNC_BATCH_SIZE);
 
-if (max_count <= 0) max_count = INT_MAX;
+if (!max_count) max_count = INT_MAX;
 
 if (chdir(mailbox->path)) {
 syslog(LOG_ERR, "Couldn't chdir to %s: %s",
@@ -1432,6 +1432,8 @@ static int upload_messages_from(struct m
 struct sync_index_list *index_list;
 unsigned max_count = config_getint(IMAPOPT_SYNC_BATCH_SIZE);
 
+if (!max_count) max_count = INT_MAX;
+
 if (chdir(mailbox->path)) {
 syslog(LOG_ERR, "Couldn't chdir to %s: %s",
mailbox->path, strerror(errno));


Re: Sync_client error Hit upload limit 0

2008-04-02 Thread Bron Gondwana
On Wed, Apr 02, 2008 at 07:19:07AM -0400, Ken Murchison wrote:
> Bron Gondwana wrote:
>> On Wed, Apr 02, 2008 at 09:51:50PM +1100, Bron Gondwana wrote:
>>> On Wed, Apr 02, 2008 at 11:28:07AM +1030, Stephen Carr wrote:
>>>> I get the following type of error (see below) during replication that
>>>> appeared after upgrading from 2.3.8 to 2.3.11.
>>> BAH - upload_messages_from() is broken.
>>>
>>> Will reply shortly with a patch,
>> Ken - CCing you on this one since you'll want this for CVS.
>> I have compile tested this - haven't rolled it out anywhere,
>> but it's pretty trivial.
>
> I understand the missing check for max_count's value, but is there a reason
> why you're not checking that it's negative?

unsigned



Re: Sync_client error Hit upload limit 0

2008-04-02 Thread Bron Gondwana
On Wed, Apr 02, 2008 at 11:28:07AM +1030, Stephen Carr wrote:
> Apr  2 10:31:14 brooks sync_client[23049]: Hit upload limit 0 at UID
> 101775 for user.XXX, sending

Btw - you can get the same effect as applying the patch by putting 
the following in your imapd.conf:

sync_batch_size: 4294967295

Yeah, that magic number is (2^32 - 1), UINT_MAX in other words.  The patch
uses INT_MAX (2^31 - 1) instead, but either way the batch size is
effectively unlimited.

(or you can set it somewhere middling.  We set it to 1000)

Regards,

Bron.



Re: Sync_client error Hit upload limit 0

2008-04-02 Thread Bron Gondwana
On Wed, Apr 02, 2008 at 07:46:33AM -0400, Ken Murchison wrote:
> Bron Gondwana wrote:
>> On Wed, Apr 02, 2008 at 07:19:07AM -0400, Ken Murchison wrote:
>>> Bron Gondwana wrote:
>>>> On Wed, Apr 02, 2008 at 09:51:50PM +1100, Bron Gondwana wrote:
>>>>> On Wed, Apr 02, 2008 at 11:28:07AM +1030, Stephen Carr wrote:
>>>>>> I get the following type of error (see below) during replication that
>>>>>> appeared after upgrading from 2.3.8 to 2.3.11.
>>>>> BAH - upload_messages_from() is broken.
>>>>>
>>>>> Will reply shortly with a patch,
>>>> Ken - CCing you on this one since you'll want this for CVS.
>>>> I have compile tested this - haven't rolled it out anywhere,
>>>> but it's pretty trivial.
>>> I understand the missing check for max_count's value, but is there a
>>> reason why you're not checking that it's negative?
>> unsigned
>
> Ah, right.  Since config_getint() returns signed, we should probably make 
> max_count signed

Sounds reasonable - then I'm happy with <= 0 :)

Bron.



Re: Sync_client error Hit upload limit 0

2008-04-02 Thread Bron Gondwana

On Thu, 3 Apr 2008 13:11:22 +1030 (CST), "Stephen Carr" <[EMAIL PROTECTED]> said:
> Dear Ken
> 
> I killed master and restarted it but forgot to kill sync_client so it may
> have been running the unpatched version.

Sounds viable.  If you're starting sync_client from outside your cyrus
instance then you need to manage it with your init script as well.  We
actually do this with a wrapper tool, but the underlying concept is
there.

> I have manually restarted sync_client and after 30+ minutes had no
> "errors" and in that period I know one user got 8 emails at almost the
> same time.
> 
> It also seems the sync_client in 2.3.11 is more stable than in 2.3.8.
> 
> I have a cronjob to restart the sync_client if it "died" which usually
> happened 2 or 3 times a day, the new version has not had to be restarted
> except to get the patched version to run.

Yeah, I'm not surprised.  A bunch of us have spent a fair bit of time on
tracking down the bugs in replication and squashing them one-by-one!  We
(FastMail) only see bailouts when there's something really bogus like a
corrupt cache file (or when a replica gets shut down with the sync still
running of course).

Glad you seem sorted :)

Bron ( been off Cyrus patching for a while now... was time to take a
   break and work on something else )
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Patch to avoid mailboxesdb corruption on concurrent renames

2008-04-03 Thread Bron Gondwana
I've put a header on the patch describing the bug.  Basically,
the result code from mailbox_open_locked() wasn't being tested
sufficiently, and hence the new mailbox name would be created
in mailboxes.db even though the files were no longer available
to be copied - causing sync bailouts and IOERRORs and all sorts
of fun.

What causes this?  Clients that send a RENAME or DELETE and then,
when you do something else in the GUI, open a _separate_
connection which tries to do something else to the source
folder.

I think it is our only remaining source of bailouts!  It
requires a reconstruct to fix when it happens.

Patch attached, and the perl script I used to confirm that it
existed and confirm that this patch fixed it.

Regards,

Bron ( P.S. isn't it about time for a 2.3.12?  I'm getting sick
   of posting skiplist patches to people running the
   latest and having issues! )
#!/usr/bin/perl

use strict;
use warnings;
use IO::Socket::INET;

$| = 1;

our $ADDR = '127.0.0.1:143';
our $USER = '[EMAIL PROTECTED]';
our $PASS = 'FIXME';

my $source = prepare_source();

# Fire off three concurrent renames of the same source folder
my %h;
foreach my $n (1..3) {
  my $pid = fork();
  die "fork failed: $!" unless defined $pid;
  unless ($pid) {
    my $res = do_rename($source, $n);
    print "RES: $n => $res\n";
    exit();
  }
  $h{$pid} = 1;
}
foreach my $pid (keys %h) {
  waitpid($pid, 0);
}

dump_folders();

# Create the source folder and put a message in it
sub prepare_source {
  my $sourcename = "INBOX.source$$";
  my $fh = IO::Socket::INET->new($ADDR) or die "connect to $ADDR failed: $!";
  $fh->getline();                            # server greeting
  $fh->print(". login $USER $PASS\r\n");
  $fh->getline();
  $fh->print(". delete $sourcename\r\n"); # just in case!
  $fh->getline();
  $fh->print(". create $sourcename\r\n");
  $fh->getline();
  print "$sourcename: ";
  foreach my $n (1..1) {
    my $msg = gen_msg($n);
    my $len = length($msg);
    $fh->print(". append $sourcename {$len}\r\n");
    $fh->getline();                          # "+" continuation request
    $fh->print("$msg\r\n");
    $fh->getline();                          # tagged response to the append
    print "." unless $n % 50;
  }
  print " done\n";
  $fh->print(". logout\r\n");
  $fh->getline();
  return $sourcename;
}

# Try to rename the source folder to a per-child destination;
# returns the server's response line to the rename
sub do_rename {
  my $sourcename = shift;
  my $n = shift;
  my $fh = IO::Socket::INET->new($ADDR) or die "connect to $ADDR failed: $!";
  $fh->getline();                            # server greeting
  $fh->print(". login $USER $PASS\r\n");
  $fh->getline();
  $fh->print(". delete $sourcename-dest$n\r\n"); # just in case!
  $fh->getline();
  $fh->print(". rename $sourcename $sourcename-dest$n\r\n");
  my $res = $fh->getline();
  $fh->print(". logout\r\n");
  $fh->getline();
  return $res;
}

# Build a small test message
sub gen_msg {
  my $n = shift;

  return qq{Subject: Message $n
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>

This is a test message made long by shoving a whole pile of lines on it!

} . ("$n\n" x 10);

}

# List the folders that exist once all the renames have finished
sub dump_folders {
  my $fh = IO::Socket::INET->new($ADDR) or die "connect to $ADDR failed: $!";
  $fh->getline();                            # server greeting
  $fh->print(". login $USER $PASS\r\n");
  $fh->getline();
  $fh->print("TAG1 list \"INBOX.*\" \"*\"\r\n");
  while (my $res = $fh->getline()) {
    last if $res =~ m{^TAG1 };
    print $res;
  }
  $fh->print(". logout\r\n");
  $fh->getline();
}
Handle concurrent mailbox renames safely

*CANDIDATE FOR UPSTREAM*

About the last cause of sync bailouts at FastMail has been 
the case where a user starts renaming a folder, and then does
something else (including deletes with renames here, since we
have the deleterename option enabled)

This was caused by the mailbox_rename_copy function not
checking the return code of mailbox_open_locked properly
(probably due to it being a huge function with multiple
different styles of using the if (!r) error checking
paradigm in different places, but my can-be-bothered for
major refactors is low at the moment)

The side effect of this was that if the mailbox existed
going in (copy was still underway) then the old mailbox
would get deleted (noop, or corruption if you don't have
all the skiplist patches applied!) and the new mailbox name
created but with no files in place!

This patch checks the return code of mailbox_open_locked
and aborts with an IO error if the mailbox has already been
locked.
Index: cyrus-imapd-2.3.11/imap/mboxlist.c
===
--- cyrus-imapd-2.3.11.orig/imap/mboxlist.c	2008-04-04 00:59:36.0 +
+++ cyrus-imapd-2.3.11/imap/mboxlist.c	2008-04-04 02:47:46.0 +
@@ -1339,11 +1339,15 @@ int mboxlist_renamemailbox(char *oldname
 if(!r) {
 	r = mailbox_open_locked(oldname, oldpath, oldmpath, oldacl, auth_state,
 &oldmailbox, 0);
-	oldopen = 1;
+	if (r) {
+	goto done;
+	} else {
+	oldopen = 1;
+	}
 }
 
 /* 6. Copy mailbox */
-if (!r && !(mbtype & MBTYPE_REMOTE)) {
+if (!(mbtype & MBTYPE_REMOTE)) {
 	/* Rename the actual mailbox */
 	r = mailbox_rename_copy(&oldmailbox, newname, newpartition,
 NULL, NULL, &newmailbox,


Re: Patch to avoid mailboxesdb corruption on concurrent renames

2008-04-04 Thread Bron Gondwana
On Fri, Apr 04, 2008 at 07:10:27AM -0400, Ken Murchison wrote:
> Bron Gondwana wrote:
>
>> Bron ( P.S. isn't it about time for a 2.3.12?  I'm getting sick
>>    of posting skiplist patches to people running the
>>    latest and having issues! )
>
> Yes, it probably is.  Perhaps I'll make a pre-release today while I troll 
> bugzilla for any showstoppers.

That would be nice.  Also, I'll check it against our patch list and
see if there's anything bugfixy in there that I haven't pushed to
you yet.

By the way - would you prefer me to push things through Bugzilla?
So far I've just been posting patches to the mailing list, but I'm
happy to use whatever process is easiest for you.

Bron.



Re: cyr_expire -E ?

2008-04-19 Thread Bron Gondwana
On Sat, Apr 19, 2008 at 09:19:40AM +0100, David Carter wrote:
> On Fri, 18 Apr 2008, David R Bosso wrote:
> 
> > I don't specify a -X, I just want to prune the duplicate db.  What am I
> > doing wrong?
> 
>   -X expunge-days
> Expunge  previously  deleted  messages  older  than expunge-days
> (when using the "delayed"  expunge  mode).   The  default  is  0
> (zero) days, which will expunge all previously deleted messages.
> 
> Try -X . cyr_expire is a bit overloaded.

That's something from upstream rather than from the FastMail patches.

I can see that it's unwanted behaviour, but by the same token I
accept the logic behind it:

a) don't break current installations (you can't require a -X
   parameter)
b) have as similar behaviour to the current as possible (never
   deleting mail by default would fill up people's drives with
   deleted spams, besides which the performance hit would suck
   over time, having those huge cyrus.expunge files sitting
   around)
c) there is no (c)
d) oh yeah, adding a "cyr_expire_expunge_default_keep_forever = yes"
   flag to cyrus.index is ugly and extra complex and not that much
   different from -X $INTMAX anyway.

Or something like that,

Bron.



Re: cyr_expire -E ?

2008-04-20 Thread Bron Gondwana

On Sun, 20 Apr 2008 07:26:54 -0400, "Steve Huston" <[EMAIL PROTECTED]> said:
> On 4/19/08 8:46 AM, Blake Hudson wrote:
> > I haven't looked at the source, but couldn't another flag be added that
> > would mimic the old behavior of only pruning the duplicate db? I would
> > assume that would be cleaner/faster than the proposed -X $INTMAX method
> > that would have to compare a bunch of timestamps...
> 
> Doesn't that exist in the form of "expunge_mode: immediate" in
> /etc/imapd.conf?  I never had an -X flag in my cyr_expire commandline
> until recently when I set expunge_mode to delayed and added the flag
> myself.

cyr_expire_expunge_mode: immediate

hmm - that looks viable to me!  Feel like testing it?  Basically it tells
cyr_expire that you have immediate expunge, so don't bother expunging, but
everything else will do delayed delete.

Don't come crying when your mailboxes melt under the load of huge spam
databases whenever you delete a message though... actually, it's not so
bad since cyrus.expunge doesn't get sorted every time, the records just
get appended.

Bron.

-- 
  Bron Gondwana
  [EMAIL PROTECTED]




FastMail.FM patchset updated against 2.3.12

2008-04-22 Thread Bron Gondwana
On Mon, Apr 21, 2008 at 07:35:01AM -0400, Ken Murchison wrote:
> I am pleased to announce the release of Cyrus IMAPd 2.3.12.  This
> release should be considered production quality.

http://cyrus.brong.fastmail.fm/

Notable changes:

* cyrus-findall-txn-2.3.12.diff
  
  Creates a mboxlist_findall_txn function which takes a transaction,
  allowing you to use it within a transaction.  This is a little
  "shutting the barn door" now that we have skiplist fixes for the
  same issue, but I think it's still valuable from a correctness
  point of view.

* cyrus-fastrename-2.3.12.diff

  Update the fastrename patch to use mboxlist_findall_txn when
  searching for inferiors and throughout all the rename and delete
  paths for mailboxes.

* cyrus-folder-limit-2.3.12.diff

  Basically just updates the folder limit patch to use the new API
  from the previous two patches, passing the transaction down.


Other than this, it's basically the same old patches refreshed
against 2.3.12, and of course everything that's been accepted
upstream is removed from our series now.


Testing status: I've built and run this on our testbed machine,
including a rename test harness, but it hasn't been run on
production machines yet.


Bron.



Re: Cyrus IMAPd 2.3.12 Released

2008-04-23 Thread Bron Gondwana
On Wed, 23 Apr 2008 10:33:27 +0200, "Thomas Robers - TuTech Innovation GmbH" <[EMAIL PROTECTED]> said:
> Ken Murchison schrieb:
> > I am pleased to announce the release of Cyrus IMAPd 2.3.12.  This
> > release should be considered production quality.
>
> I'm getting a segmentation fault when I try to start the master process.
> Compiling went fine on Debian etch (2.6.18-5-xen-amd64) with gcc version
> (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
>
> [...]
>
> open("/var/imap/imapd.conf", O_RDONLY)  = 3
> fstat(3, {st_mode=S_IFREG|0640, st_size=1366, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
> = 0x2b6c1cebe000
> read(3, "# (21.02.2008)\n## directori"..., 4096) = 1366
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> +++ killed by SIGSEGV +++
> Process 26096 detached

This would be even more useful:

% gdb /opt/imap/libexec/master
(gdb) run -d
[ wait for it to segfault ]
(gdb) bt

> Compiling and running version 2.3.11 with the same options on the same
> machine works fine.

Are you able to post your /var/imap/imapd.conf file?

I'm guessing there's something in there which is causing the segfault.

Regards,

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: Cyrus IMAPd 2.3.12 Released

2008-04-23 Thread Bron Gondwana
On Wed, Apr 23, 2008 at 05:58:27PM +0400, Dmitriy Kirhlarov wrote:
> Sebastian Hagedorn wrote:
> > --On 23. April 2008 15:37:19 +0400 Dmitriy Kirhlarov <[EMAIL PROTECTED]> wrote:
> > 
> >> The attached patch logs information about messages being moved between
> >> folders. I use this information from the logs to re-run dspam.
> >> Any chance of adding this patch to the project tree?
> > 
> > FWIW, logging this at LOG_ERR level certainly isn't the right way to do 
> > that ... I'd say it should be INFO at best, if not DEBUG.
> 
> And with this correction, can the patch be included in the cyrus imapd repo?

Have you looked at:

http://cyrus.brong.fastmail.fm/patches/cyrus-auditlog-2.3.11.diff

It's a very detailed logging system which logs all create, delete,
append, copy, expunge, unlink, etc events.  Anything which changes
a mailbox or message (but not metadata events like flag changes
at the moment).  It also logs noteworthy sieve events.

It logs everything at LOG_NOTICE.

If there are other users for it, I'm happy to put some effort into
making auditlog acceptable for upstream, and possibly generalising
it to allow logging of different classes of events.

We use it to populate a database which is linked with events from
the various login systems and the email tracking information from
our Postfix and lmtp setups - it's not finished yet, but the plan
is to be able to track the entire lifecycle of every message we
receive (probably only for a few weeks to save space!) so users
can see what's happening with their emails.

Regards,

Bron.



Re: Cyrus IMAPd 2.3.12 Released

2008-04-23 Thread Bron Gondwana
On Wed, Apr 23, 2008 at 03:37:19PM +0400, Dmitriy Kirhlarov wrote:
> If dspam gets it wrong, the user can manually move the message from|to the
> "spam" folder. This is recorded in the cyrus log file. A simple script
> parses the log and re-runs dspam.

> +syslog(LOG_ERR, "DSPAM-Hack index_copysetup(): %s -> %s, hdr %s", mailbox->name,
> +copyargs->name, index_getheader(mailbox, msgno, "X-DSPAM-Signature"));

Wow - that's pretty tricky.  I see you're actually logging a specific
header as well.  Funky.  We don't have anything like that in our
auditlog patch.

I'd still suggest that this should be a generic mechanism rather than
a hard-coded header if it's going in upstream.  Something like

auditlog_headers: X-DSPAM-Signature X-Something-Else

which would cause a log line:

auditlog: copy oldmailbox= mailbox= olduid=<1234> 
uid=<7> guid=<478920478932fabed74398943243> x_dspam_signature= 
x_something_else=

Of course there would need to be quoting support since these headers
could contain <>.  Oh the humanity.  My favourite is URI encoding
because it's really quite simple to parse, but I'm sure everyone has
their favourite.
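
For what it's worth, the URI-encoding idea is only a handful of lines. The following is a rough sketch under my own assumptions, not anything from the auditlog patch: uri_encode() and is_unreserved() are hypothetical names, and the choice to leave only ASCII letters, digits and "-_.~" unescaped is just one conservative option. It guarantees that "<", ">", "=" and spaces never reach the log line raw:

```c
#include <stddef.h>

/* Characters that are written through unchanged (an assumption:
 * the "unreserved" set; everything else becomes %XX). */
static int is_unreserved(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '_' || c == '.' || c == '~';
}

/* Percent-encode 'in' into 'out' (capacity 'outlen', including the NUL).
 * Returns the encoded length, or -1 if 'out' is too small.
 * Hypothetical helper for illustration only. */
static int uri_encode(const char *in, char *out, size_t outlen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    for (; *in; in++) {
        unsigned char c = (unsigned char)*in;

        if (is_unreserved(c)) {
            if (o + 1 >= outlen) return -1;   /* char + trailing NUL */
            out[o++] = (char)c;
        } else {
            if (o + 3 >= outlen) return -1;   /* %XX + trailing NUL */
            out[o++] = '%';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0F];
        }
    }
    if (o >= outlen) return -1;               /* only hit when outlen is 0 */
    out[o] = '\0';
    return (int)o;
}
```

A header value such as "a<b>@c d" would come out as "a%3Cb%3E%40c%20d", which a log parser can split on whitespace and "=" without any quoting rules.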

What do you think?

Bron.



Re: Cyrus - GFS slow start and poor performace

2008-05-14 Thread Bron Gondwana

On Wed, 14 May 2008 10:59:33 +0200, "Maurizio Lo Bosco" <[EMAIL PROTECTED]> said:
> Hi all,
> I know that the usage of GFS has been discussed for a long time on this
> mailing list, but I would like to know if it is normal to have a very slow
> start (15 minutes) with just 4300 users (the cyrus db is composed of
> 20940 lines).
> It happens only with GFS and the skiplist database; using the flat format
> it takes a few seconds to start.
> The system is composed of 2 IBM x3550 with redhat enterprise linux 5.1
> attached to a SAN IBM DS4700 with a dual fibre channel (4Gb/s multipath
> active-backup).
> 
> The dump of the database takes 7 minutes but the disk usage is definitely
> low (less than 5%)
> RedHat is saying that there is no way to optimise the performance on the
> GFS locking architecture and they will now take a look at the cyrus code.
> Do you have any tips?

Skiplist mailboxes.db gets a "recovery" run on it at startup.  The recovery
visits all the records in the file.

That said, it does it all in order.

Can you post the syslog output of cyrus as it starts (slowly)?  I wonder if
it's also doing a checkpoint, which visits all the mailboxes.db records as
well, but does them in...


oh, indeed.

Recovery also writes back pointers all over the place.  It does LOTS of writes
to random locations within the file.  If GFS is doing something insane like
writing back the entire file to the server for every single update (generally
4 bytes at a time) then this could be a big problem!

That said, the file is locked with a fcntl (flock if fcntl isn't available) lock
over the entire file +append space.  This is an exclusive lock, and it's held
for the entire recovery run.  If GFS's locking can say "just do the updates and
save copying back until the fsync at the end" then that should speed it up.

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: Cyrus - GFS slow start and poor performace

2008-05-14 Thread Bron Gondwana
[reply number 2 - addressing bits I missed in the first reply...]

On Wed, 14 May 2008 10:59:33 +0200, "Maurizio Lo Bosco" <[EMAIL PROTECTED]> 
said:
> The dump of the database takes 7 minutes but the disk usage is definitely
> low (less than 5%)

A dump of the database visits all records in alphabetical order.  This can
result in somewhat random looking seeks around the file due to the layout of a
skiplist, but it will happen within the mmap.

> RedHat is saying that there is no way to optimise the performance on the
> GFS locking architecture and they will now take a look to the cyrus code. 

You may want to pass on to the RedHat engineers that Cyrus uses an mmap of the
entire file to read all records, and uses seeks and direct writes on the same fd
(or a different fd, depending on compilation settings) to write.  Skiplist
appends entire records to the file, but also seeks back and updates pointers
(4 byte records) within the file with each update.

That's writing.

Reading - it reads each record, gets a pointer to the location of the next one,
and reads from the memory location that corresponds to db->map_base +
pointer_offset.
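
The read side can be sketched in a few lines of Python.  The record layout
here is a toy one invented for the example (a "next" offset, a key length,
then the key); it is NOT the real skiplist on-disk format, but the pattern of
"read a record from the map, follow its pointer" is the same idea.

```python
import mmap
import struct


def walk_records(path, first_offset):
    """Return keys in list order by following 'next' offsets in an mmap.

    Toy layout (an assumption, not the real skiplist format): each record
    is a 4-byte big-endian offset of the next record (0 = end of list),
    a 4-byte big-endian key length, then the key bytes.
    """
    keys = []
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as m:
            offset = first_offset
            while offset != 0:
                # Equivalent of reading at map_base + pointer_offset.
                next_offset, keylen = struct.unpack_from(">II", m, offset)
                keys.append(bytes(m[offset + 8:offset + 8 + keylen]))
                offset = next_offset
    return keys
```

Note that list order and file order need not agree, which is where the
random-looking access within the map comes from.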


Depending on your requirements, it may make sense to place your mailboxes.db on
local disk (it's pretty small) and regularly copy/rsync it onto your GFS
partition.  Worst case you lose a couple of mailboxes.db records in a crash.
Depends what you can afford to lose.  You could probably stat the file every
second and copy it on any change pretty cheaply and risk losing at most the
last second's changes (it doesn't change often).
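
A rough sketch of the stat-and-copy idea in Python.  The function name and
the paths are placeholders, and it deliberately ignores locking, so a copy
taken while Cyrus is in the middle of a write may need a recovery run when it
is next opened.

```python
import os
import shutil


def copy_if_changed(src, dst, last_seen=None):
    """Copy src to dst when its (mtime_ns, size) differs from last_seen.

    Returns the current signature so the caller can pass it back in on
    the next poll.  The copy goes to a temporary name and is renamed into
    place, so a reader of dst never sees a half-written file.
    """
    st = os.stat(src)
    signature = (st.st_mtime_ns, st.st_size)
    if signature != last_seen:
        tmp = dst + ".tmp"
        shutil.copy2(src, tmp)
        os.replace(tmp, dst)
    return signature
```

A caller would loop with a one second sleep, feeding the returned signature
back in each time; one stat per second costs next to nothing.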

Regards,

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: sync_server crashing..

2008-05-17 Thread Bron Gondwana

On Fri, 16 May 2008 19:18:04 + (GMT), "Andy Fiddaman" <[EMAIL PROTECTED]> 
said:
> 
> Hi,
> 
> I'm hoping that someone who is familiar with the replication code can
> help me with a problem I'm seeing with Cyrus 2.3.12.
> 
> I have a two server replicated setup and sync_server is occasionally
> crashing. Once it's crashed once it keeps on crashing until I completely
> reset replication by snapshotting the master, removing the sync logs
> and rsyncing the snapshot to the replica.
> 
> The crash is happening in sync_cacheitem_size()
> 
> Core was generated by `sync_server'.
> Program terminated with signal 11, Segmentation fault.
>
> [...]
>
> This last itemlen pushes the pointer out of the allocated memory and
> causes the crash.
> 
> Any ideas on whether these entries look right and where I should look
> next to debug it?

You have a corrupted cache file.  Various things could have caused this,
it isn't easy to know what it was.

Your fix works because once you rsync, there is no mention of the folder
with the problem in the sync log any more, however next time anything
happens on that folder you get the crash again.

1) figure out what folder it is
2) reconstruct it
3) profit???

Enjoy,

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: sync_server crashing..

2008-05-18 Thread Bron Gondwana

On Sun, 18 May 2008 21:44:45 + (GMT), "Andy Fiddaman" <[EMAIL PROTECTED]> 
said:
> On Sun, 18 May 2008, Bron Gondwana wrote:
> 
> ; You have a corrupted cache file.  Various things could have caused this,
> ; it isn't easy to know what it was.
> 
> Thanks.
> 
> I can't find the corrupted mailbox so I've run reconstruct on everything,
> rsyncd the master to replica again and I'll see if the problem recurs
> (you can tell it isn't a massive mailstore!)
> 
> I tried to find the mailbox with the problem by writing a quick program
> to scan through each cache file and it didn't detect any errors. I also ran
> mbexamine on every mailbox with no problems so I don't know where the
> corruption, if any, was.
> 
> Keeping my fingers crossed anyway,

Hmm.. by corrupted cache file it could actually be the cache base pointers
from the cyrus.index that are corrupted.  One cause was delayed expunge
and reconstruct, but David Carter wrote some patches which got into 2.3.12
to fix that, so new reconstructs will be fine.

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: are these error messages severe? and how to fix them?

2008-05-19 Thread Bron Gondwana

On Mon, 19 May 2008 21:07:55 +0200 (CEST), "Simon Matter" <[EMAIL PROTECTED]> 
said:
> > On Mon, May 19, 2008 at 6:31 PM,  <[EMAIL PROTECTED]> wrote:
> >> greetings all.
> >>
> >> This morning a user called me saying that he was reading his email
> >> (via squirrelmail) in one computer, then he logged out, and some time
> >> later went to another computer, opened squirrelmail, and his mail was gone
> >>
> >> I checked directly in the mailstore and he had only a couple of
> >> messages, but he assures me that he did not delete anything
> >>
> >> After some search on the log files, I found something like this:
> >>
> >> May 19 09:56:05 ccaix imaps[8619]: skiplist: recovered
> >> /var/lib/imap/user/C/user^name.seen (3 records, 7316 bytes) in 0 seconds
> 
> I think skiplist files are always "recovered" when they are opened. So
> that is not a sign of anything wrong.

Yeah, all that means is that the timestamp of the skiplist file is earlier
than the timestamp of the last time cyrus was started.  A "recovery" just
goes through the file and makes sure that all the pointers are correct.

That message is harmless.

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: Cyrus POP access restritcion to users

2008-05-20 Thread Bron Gondwana
On Tue, May 20, 2008 at 08:57:14PM +0530, Ashay Chitnis wrote:
> Hi,
> 
> I wanted to know if we can restrict some users to access POP and allow some
> users to access POP. I do not want to have firewall based restriction.  I am
> using cyrus-imapd-2.3.7-4. The same users should be allowed through Webmail
> without any issue. The users are LDAP users.  Can anyone help me on this?

We do exactly this at FastMail, but we use a different approach.  All
user connections are via an nginx proxy, and the authentication daemon
used by nginx will return an error if the user tries to log in via POP.
It will also send them an email explaining the policy and offering them
an option to upgrade to an account level that does support POP...
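
For anyone wanting to try the same thing, here is a hedged Python sketch of
the decision such an authentication daemon makes.  The header names are the
ones nginx's mail auth_http protocol uses; the account fields, the port table
and the plain-text password check are invented for the example and are not
how FastMail's daemon works.

```python
# Hypothetical backend ports, one per proxied protocol.
BACKEND_PORTS = {"imap": 143, "pop3": 110}


def mail_auth_response(request_headers, accounts):
    """Build the response headers for one nginx auth_http request.

    accounts maps username -> dict with 'password', 'pop_allowed' and
    'backend_host' (made-up field names).  Header lookup is exact-case
    here for brevity; a real handler should treat names
    case-insensitively.
    """
    user = request_headers.get("Auth-User", "")
    password = request_headers.get("Auth-Pass", "")
    protocol = request_headers.get("Auth-Protocol", "")

    account = accounts.get(user)
    if account is None or account["password"] != password:
        return {"Auth-Status": "Invalid login or password", "Auth-Wait": "3"}
    if protocol == "pop3" and not account["pop_allowed"]:
        # Any Auth-Status other than "OK" is passed to the client as the
        # error text, so the policy can be explained right there.
        return {"Auth-Status": "POP access is not enabled for this account"}
    return {
        "Auth-Status": "OK",
        "Auth-Server": account["backend_host"],  # nginx expects an IP here
        "Auth-Port": str(BACKEND_PORTS.get(protocol, 143)),
    }
```

Because webmail talks IMAP to the backend, the same users keep working there
while POP logins are refused.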

Bron.



Re: Protection against POP or IMAP Denial of Service (DOS)

2008-05-20 Thread Bron Gondwana
On Wed, May 21, 2008 at 12:32:33AM +0200, Stéphane BERTHELOT wrote:
> But being recently attacked many times especially on POP3 service I am 
> looking for some advice or maybe making a feature request for some more 
> protection against DOS.

Gosh, I seem to be spending a lot of time pimping nginx here!  We get
protection against this sort of DOS for free (as well as load balancing
and so on) by having frontend servers running nginx as a proxy.

Nginx is compiled (on our 2.6.x kernels) with epoll support, so it can
handle bazillions of connections with the 8 processes it's configured to
use.  It also handles SSL (so the backend IMAP machines don't need to)
and deals with the connection up until the point where the user is
authenticated, at which stage it performs a login on the backend server
and links the connection through.

Compared to Perdition which was one-process-per-connection, this has
scaled amazingly well.  One medium spec machine can easily handle
(checks) about 7000 connections at the moment, and it scales to a lot
more than that during the US day.  That's with HTTP proxy, authenticated
SMTP injection, ftp server, lots of other things - and the frontend
machine is still barely using one of the 4 processor cores in it.

You could easily put nginx on your IMAP server directly if you didn't
want to dedicate a second machine to the job, and it would handle the
DOS risk for you.

I like this approach from a UNIX design perspective.  One service that
is designed for coping with DOS attacks and talking to the outside
world, and a separate service that is designed purely for actually
providing the service, rather than complicating it with DOS accounting
and tracking mechanisms.

Bron.



Re: Protection against POP or IMAP Denial of Service (DOS)

2008-05-21 Thread Bron Gondwana

On Wed, 21 May 2008 07:13:10 +0200, "Christiaan den Besten" <[EMAIL PROTECTED]> 
said:
> Bron,
> 
> What does the authentication for nginx for you, since it can't query  
> for example a ldap directly ( at least, not the last time I checked )?  
> The epoll will scale, but wondering what is the most 'light' method to  
> do the actual authentication ..

Perl, it's the swiss cheese^H^H^H^H^H^Harmy knife of tools.

Specifically, we have this funky little thing that's increasingly
inaccurately named "saslperld".  It's just a forking Net::Server
derivative that listens on unix sockets.  It currently talks the
following protocols:

* lookup
* mux
* nginx
* perdimap
* perdpop
* vfs

Ok - so we don't use either of the perdition ones any more, they should
probably get removed in the cleanup I'm planning to do later this week
(while working on one time password, openid, other goodies).

"lookup" is a simple key value protocol allowing usernames to be resolved
to our internal userids.  It's used by log analysis tools.

"mux" is the saslauthd protocol.  Some sort of packed struct format from memory.

"nginx" is the nginx http authentication protocol

"vfs" is also very badly named.  It's the protocol that I originally wrote for
handling our vfs interfaces (DAV & FTP) but has since expanded to be used by
our web interface and every other bit of code that wants to check user
authentication details, because the protocol is so easy to use from our
perl libraries.

The overhead of unix sockets really is very low, and being separate processes
means any epoll thingy (looking at DJabberd soon hopefully) can chat to it
asynchronously without having to do its own thread pool.

It also listens on a UDP port for broadcast cache expiry events and caches user
details to reduce database traffic for protocols with frequent short-lived 
logins.
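
The caching half of that is easy to sketch.  Below is a small Python version,
with the big caveat that the "EXPIRE <user>" message format, the class name
and the TTL are all invented for the example; the real daemon is Perl and its
wire format isn't described here.

```python
import time


class UserCache:
    """Tiny TTL cache with explicit expiry, keyed by username."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # username -> (expiry time, details)

    def get(self, username, loader):
        hit = self.entries.get(username)
        if hit is not None and hit[0] > self.clock():
            return hit[1]
        details = loader(username)  # e.g. a database lookup
        self.entries[username] = (self.clock() + self.ttl, details)
        return details

    def handle_datagram(self, payload):
        """Apply one expiry message of the made-up form b'EXPIRE <user>'."""
        parts = payload.decode("utf-8", "replace").split()
        if len(parts) == 2 and parts[0] == "EXPIRE":
            return self.entries.pop(parts[1], None) is not None
        return False
```

In a daemon, each process would bind a UDP socket and feed every received
payload to handle_datagram(), so a password change broadcast from one machine
drops the stale entry everywhere.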

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: Status of Cyrus replication

2008-05-23 Thread Bron Gondwana
On Thu, May 22, 2008 at 02:17:02PM -0500, Blake Hudson wrote:
> Hey all, last time I checked replication was undergoing major overhauls 
> and incompatibility between minor versions of 2.3.x was pretty great. 
> There were also a few bugs that could potentially cause trouble down the 
> road. I've had the need to create setups with failover servers and have 
> continued using rsync on an interval (~30 to 60 min) for this purpose. 
> Unfortunately this causes quite a lot of IO load on the servers and I 
> was hoping that a rolling replication setup would help resolve this.

Yeah, it would!  Are you using rsync 3.0?  It doesn't help with the IO
load, but at least it's a bit more incremental about things.

Also, you can get huge performance wins with a tiny bit of custom code,
something like this hunk of untested perl:

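use strict;
use warnings;

my $dir = shift or die "usage: $0 <mailbox directory>\n";
opendir(my $dh, $dir) or die "opendir $dir: $!";
# assign explicitly: a bare readdir in while() only sets $_ on newer perls
while (defined(my $name = readdir($dh))) {
  if ($name =~ m/^cyrus\./) {
    # rsync this file, could have changed arbitrarily
  }
  elsif ($name =~ m/^\d+\.$/) {
    # this is a cyrus message file, if it exists on the replica then
    # no need to try and sync
  }
  elsif ($name !~ m/^\./) {
    # this is a subfolder, sync it.
  }
}
closedir($dh);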

Basically, you don't need to stat the message files, which are the
bulk of your data.

... but that's still a lot of custom protocol development and stuff.
Annoying.

> What's the status of Cyrus replication in the latest releases of 2.3.x - 
> specifically with virtual domains enabled?

It's getting pretty good actually.  Most of our replication errors
for the last couple of weeks have been traced back to a bug in our
automated user-move code, which meant it failed to add a "USER $foo"
to the sync log after moving users to new servers - so moved users
who had no activity were not replicated.
 
> It also seems like there have been some problems with the latest 
> releases of 2.3 and I'm hesitant to upgrade my 99% working 2.3.1 
> install. Any lingering issues or reason not to upgrade?

There were some bad times in there.  The only outstanding bug I'm
aware of in 2.3.12 is the blank lines in config file segfault -
you'll either see that straight away or not at all!

> For those who have the need to create a "hot spare" server and are not 
> using Cyrus replication, what method are you guys using to accomplish 
> this goal?

Our backup system (not quite the same!) uses a perl module which
reads the folder records from mailboxes.db and then uses fcntl locks
on the cyrus.* files in each folder to block out cyrus while it
streams the cyrus.* files.  These are then backed up, and also
parsed to see what message files are indexed - this is compared
against what has already been fetched, and any new messages are
also fetched and stored.  It's blindingly quick through intimate
knowledge of Cyrus's internals.

In the best case, no matter how big the folder, it costs only two
stats (cyrus.header and cyrus.index, we don't bother backing up
cyrus.cache since it's all derived information).  If either of
them has changed we stream the contents of them both.  Only then
if there are new message files do we cause any IO on the data
partition, and that is direct filename opens.  No readdirs ever.
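
The last step (fetching only new messages, by name) looks roughly like this
in Python.  It is a sketch, not our backup code: parsing cyrus.index to get
the list of UIDs is left out, and the function and argument names are made up.

```python
import os


def fetch_new_messages(mailbox_dir, indexed_uids, already_fetched):
    """Return {uid: bytes} for indexed messages not yet backed up.

    indexed_uids would come from parsing cyrus.index (not shown here);
    already_fetched is the set of UIDs the backup already holds.
    """
    fetched = {}
    for uid in sorted(set(indexed_uids) - set(already_fetched)):
        # Cyrus message files are named "<uid>." so the path can be
        # built directly; no directory listing is needed.
        path = os.path.join(mailbox_dir, "%d." % uid)
        with open(path, "rb") as fh:
            fetched[uid] = fh.read()
    return fetched
```

If nothing new is indexed the loop body never runs, so the data partition
isn't touched at all.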

Bron.



Re: seen db

2008-06-10 Thread Bron Gondwana
On Tue, Jun 10, 2008 at 12:02:44PM +0200, Rudy Gevaert wrote:
> Hi,
> 
> I'm seeing this in my logs
> 
> mail5r/syncserver[19755]: seen_db: user [EMAIL PROTECTED] opened 
> /mail/mail5r/var/imap/domain/u/ugent.be/user/n/nick^andries.seen
> mail5r/master[12683]: process 19755 exited, signaled to death by 11
> mail5r/master[12683]: service syncserver pid 19755 in BUSY state: 
> terminated abnormally
> 
> Deleting the seen file on the replica, or reconstructing doesn't help.
> I need to delete the mailbox on the replica and resync it.
> 
> It's only for certain users, so I don't think it has to do with my 
> upgrade from sarge to etch.  (I brought down my lun on sarge machine, 
> and brought it up on the etch machine).  I'm running 2.3.12p2 on sarge 
> and etch.
> 
> An other downside is that my replication hangs on that user. 
> sync_client bails out, and restarts but with that user...  So he keeps 
> retrying.
> 
> I would appreciate further help in debugging the problem.

Are you running a 64 bit kernel?

(just wondering - we have hit pretty much the same issue - and were
wondering about dodgy kernel issues being a problem - it's only one
machine that seems to have corrupted seen files, only on replicas)

We've been running 2.3.12 for about a week, and it's only last night
that we had anything funny show up at all.

Interestingly, it's probably the first time cyr_expire ran on 2.3.12
just before that - and also the first time our check-replication
script was running, which loads a lot of seen files on BOTH ends.

Bron.



Re: seen db

2008-06-10 Thread Bron Gondwana

On Tue, 10 Jun 2008 15:29:01 +0200, "Rudy Gevaert" <[EMAIL PROTECTED]> said:
> Bron Gondwana wrote:
> 
> > Are you running a 64 bit kernel?
> 
> Yes, but the system is 32bit (I run 64bit kernel  + 32 emulation support)

Interesting, so do we (on etch as well)

> > (just wondering - we have hit pretty much the same issue - and were
> > wondering about dodgy kernel issues being a problem - it's only one
> > machine that seems to have corrupted seen files, only on replicas)
> 
> 
> > We've been running 2.3.12 for about a week, and it's only last night
> > that we had anything funny show up at all.
> > 
> > Interestingly, it's probably the first time cyr_expire ran on 2.3.12
> > just before that - and also the first time our check-replication
> > script was running, which loads a lot of seen files on BOTH ends.
> 
> Here cyr_expire has been running on 2.3.12 for a couple of weeks.  But 
> here the first time too with the 64bit kernel.

There you go.  We've had the 64bit kernel approximately forever, but only
just upgraded from 2.6.20 series to 2.6.25.

> I can try with a 32bit kernel tomorrow.
> 
> In attachment a strace to show where it segfaults

Almost certainly boring, since it's file corruption.  The file itself would
be significantly more interesting.  My guess - you'll be finding little blocks
of (small n)*4 bytes which happen to be NULL.  It's when they intersect with
the pointers table that things get interesting.

Oh - can you tell me.  Did the file checkpoint sometime not too long before it
got corrupted?

I've got a small set of theories, but I'm reading the skiplist source code
(again!) to see if they make sense...

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: seen db

2008-06-11 Thread Bron Gondwana
On Wed, Jun 11, 2008 at 10:52:31AM +0200, Rudy Gevaert wrote:
> Bron Gondwana wrote:
>> There you go.  We've had the 64bit kernel approximately forever, but only
>> just upgraded from 2.6.20 series to 2.6.25.
>>
>>> I can try with a 32bit kernel tomorrow.
>
> Unfortunately, with the 32bit kernel 2.6.24-2 sync_server still segfaults.

Try a 2.6.20 kernel, just for an interesting datapoint.  We changed
back to 2.6.20 (64 bit still) and haven't seen a corrupted seen file
since.

>> Oh - can you tell me.  Did the file checkpoint sometime not too long before
>> it got corrupted?
>
> The cases I saw it did.

Ditto here.  Interesting.  They also had quite long records, but
I don't know how common that is.  Lots of little bits of seen
spread around the space.

>> I've got a small set of theories, but I'm reading the skiplist source code
>> (again!) to see if they make sense...
>>
>> Bron.
>
> I'm also wondering what would happen if I brought up a master. Surely 
> the imap processes would also segfault.  Right?

If it was on those corrupted files, yes.  On that machine - quite
probably.  If you can afford the hardware it may be worth testing.

(hmm, I can possibly dedicate a 64 bit capable machine to testing
this.  If it's a kernel bug I'd love to reproduce it)

> Here I can delete the mailbox on the replica and sync again.  As a  
> reconstruct doesn't help.

We find reconstructing helps now - but that's with the 2.6.20
kernel.  There were multiple things going wrong before.  We
originally suspected the external drive unit was playing up,
but I'm thinking kernel now.

Bron.



Re: seen db

2008-06-11 Thread Bron Gondwana

On Wed, 11 Jun 2008 15:07:02 +0200, "Rudy Gevaert" <[EMAIL PROTECTED]> said:
> Bron Gondwana wrote:
> 
> > Try a 2.6.20 kernel, just for an interesting datapoint.  We changed
> > back to 2.6.20 (64 bit still) and haven't seen a corrupted seen file
> > since.
> 
> I hope to try that still today.
> 
> I'm now running on 2.6.24-2, 32bit.  I have cleaned up the users that 
> were having a corrupted mailbox on replica.  Surprisingly I can count 
> them on both hands.
> 
> So now I'm again running with rolling replication and I'm doing a 
> sync_client session for each user.  When that is finished I'll try to 
> downgrade the kernel.
> 
> Btw, I tested my sarge-> etch upgrade in a xen virtual machine, 64bit 
> kernel + 32 bit userspace.  But this was 2.6.18.
> 
> I'm still wondering if I should run 2.6.20 in 32bit or 64bit...

It's been fine for us as 64bit for a while now.

Though note - 64bit will allow lots more process space, which allows
broken cache files to REALLY SCREW WITH YOU.  Bah.  We have 4Gb core
dumps being written into our cores directory - and let me tell you,
while something is dumping core it uses some trick which totally
nukes all other IO on the same device.  It gets ioniced up there really
happy.  Ouch.

The cause - mailbox_cache_size hits a bogus "length" field and returns
like 1.7Gb as the size of the record.  This then causes an xrealloc to
"size * 2", or 3.4Gb.

At least in the case of one  mailbox that's been causing us fun.  In
a second I'll gdb that awfully large core and figure out which mailbox
is the culprit.  One reconstruct later

> >>> Oh - can you tell me.  Did the file checkpoint sometime not too long
> >>> before it got corrupted?
> >> The cases I saw it did.
> > 
> > Ditto here.  Interesting.  They also had quite long records, but
> > I don't know how common that is.  Lots of little bits of seen
> > spread around the space.
> 
> I'm not sure how I would see that?  I'm not familiar with the internals 
> of skiplist.

I find they show up pretty well as ^@^@^@^@^@^@ in less.  The skiplist format
doesn't have many all zero blocks otherwise.  Lots of other special characters
show up for binary bits.
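
If squinting at less isn't your thing, a few lines of Python will list the
runs of zero bytes instead.  This is just a generic scanner (the name and the
threshold are arbitrary); it knows nothing about the skiplist format, so a hit
is only a hint about where to look.

```python
def zero_runs(data, min_len=4):
    """Return (offset, length) for each run of NUL bytes >= min_len."""
    runs = []
    start = None
    for i, byte in enumerate(data):
        if byte == 0:
            if start is None:
                start = i  # a new run begins here
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # A run that reaches the end of the data still counts.
    if start is not None and len(data) - start >= min_len:
        runs.append((start, len(data) - start))
    return runs
```

Feed it the bytes of a suspect seen file, for example
zero_runs(open(path, "rb").read()), and compare the offsets with a hexdump.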

Sadly, I can pretty much read a hexdump of a skiplist.  Sad because that's a
lot of braincells that could be doing something useful like absorbing alcohol.


I've written a little patch for the mailbox_cache_size issue that returns 0
if the result ever looks like it's going negative or more than 100 million
bytes.  Then sync_support is patched to treat a zero cache size as "say we
failed to reserve this message".  It will do for now...
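
The shape of that check, sketched in Python rather than the actual C patch.
The record layout (a fixed number of items, each a 4-byte big-endian length
followed by data padded to a multiple of 4) and the field count of 10 are
assumptions made for the example, and Python integers can't wrap negative, so
only the upper bound and the buffer bound are tested.

```python
import struct

MAX_CACHE_RECORD = 100_000_000  # the "100 million bytes" ceiling


def cache_record_size(buf, offset, num_fields=10):
    """Return the size of one cache record, or 0 if it looks corrupt.

    Assumed layout: num_fields items, each a 4-byte big-endian length
    followed by the data padded to a multiple of 4 bytes.
    """
    pos = offset
    for _ in range(num_fields):
        if pos + 4 > len(buf):
            return 0  # the length field itself would be out of bounds
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4 + ((length + 3) & ~3)
        if pos - offset > MAX_CACHE_RECORD or pos > len(buf):
            return 0  # bogus length: refuse instead of allocating
    return pos - offset
```

The caller then treats 0 as "this message can't be reserved" rather than
trying to allocate a couple of gigabytes.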

Bron ( also found a theoretical bug in the skiplist code and patched it today,
   but I might fix the whole function before I submit it upstream.
   I say theoretical because I don't see that the codepath gets exercised
   unless you already have a corrupt file, so meh )
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: Is skiplist dependant on byte order?

2008-06-16 Thread Bron Gondwana
> > far as I can tell, the annotation_db, duplicate_db, and tlscache_db
> > are empty and can simply be removed.  Are there any others on a murder
> > front end that I've missed?  Where do they reside?

Yeah, we nuke all those on restart.  duplicate_db is the most interesting
of that lot - but not a giant concern.  It will cause vacation messages to
be repeated, and duplicate messageids to be delivered if it's gone - that's
about all.  For a once-off I wouldn't be at all concerned.

mailboxes.db really is the big one.  Anything else with berkeley named in it
that's either in your imapd.conf or defaulted that way in lib/imapoptions.
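
A quick way to find the explicitly configured ones is to scan imapd.conf for
"berkeley" values.  A small Python sketch (the function name is made up, and
it only sees what is written in the file, not the compiled-in defaults from
lib/imapoptions):

```python
def berkeley_options(conf_text):
    """Return option names whose value in imapd.conf starts with 'berkeley'."""
    found = []
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blanks, comments and anything that isn't "option: value".
        if not line or line.startswith("#") or ":" not in line:
            continue
        name, value = line.split(":", 1)
        if value.strip().lower().startswith("berkeley"):
            found.append(name.strip())
    return found
```

Anything it doesn't report still needs checking against the defaults for
your version.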

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Linux kernel bug AMD64 - affects skiplists

2008-06-17 Thread Bron Gondwana
I promised I'd have something to say about skiplists soon!

(hi Rudy - hope you had a good time off, leaving me here to
figure this out _all_by_myself_ ;) )

There's a bug in the linux kernel for amd64 builds only
that breaks some skiplist files.

Specifically, checkpointing a seen file with a long (greater
than page size) list of seen data will cause corruption where
it crosses the page break.  The last 16-24 bytes of the
page will be NULLed out.
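
To look for files that may already have been hit, something like this Python
sketch can flag page boundaries that are preceded by zeros.  It assumes
4096-byte pages and uses the 16-byte lower bound from above; the function name
is made up, and a hit is only a candidate, since a file can legitimately
contain zeros there.

```python
def nulled_page_tails(data, page_size=4096, tail=16):
    """Return page-boundary offsets whose preceding `tail` bytes are all NUL."""
    hits = []
    for boundary in range(page_size, len(data) + 1, page_size):
        if data[boundary - tail:boundary] == b"\x00" * tail:
            hits.append(boundary)
    return hits
```

Only files longer than one page can be affected, so short seen files can be
skipped without reading them.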

You can read more about it in all its gory detail here:

http://lkml.org/lkml/2008/6/17/9

Thanks Linus for the prompt (at least partial) fix.

If you are running one of those kernels now, I recommend you
either change the kernel version, or apply the patch Linus
posted.  I was going to suggest a little "magic" patch, but
I've been unable to actually make it work in testing, so I
won't do it!

Bron.



Re: Linux kernel bug AMD64 - affects skiplists

2008-06-18 Thread Bron Gondwana

On Wed, 18 Jun 2008 23:04:49 +0200, "<::Teresa_II::>" <[EMAIL PROTECTED]> said:
> У ср, 2008-06-18 у 14:00 +1000, Bron Gondwana 
> пише:
> > I promised I'd have something to say about skiplists soon!
> 
> My cyrus runs on amd64 too, so does
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=42a886af728c089df8da1b0017b0e7e6c81b5335
> 
> fix the problem ?

Yes, it does.  I haven't rolled it out to any production machines yet (just
reverted back to the 2.6.20 series kernel that we were using before) - but I
built a test kernel with it and confirmed the fix.

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: Skiplist errors on Cyrus 2.3.12

2008-07-11 Thread Bron Gondwana
On Fri, Jul 11, 2008 at 11:37:52AM +0200, Reinhard Zierke wrote:
> Hi,
> 
> Since I upgraded my Solaris 10 Postfix/Cyrus mail server to Cyrus 2.3.12,
> I have some problems when removing mailboxes.  I try to delete
> several users' mailboxes within one call of cyradm or in a homemade Perl
> script that uses Cyrus::IMAP::Admin.  The first delete in one command
> invocation always works fine, but from the second user on I get a terse
> error message "cyrus: c:" and the syslog shows something like:
> 
> Jul 11 10:32:41 mailhost.informatik.uni-hamburg.de cyradm[13944]: [ID 637875 
> local6.error] Fatal error: Internal error: assertion failed: 
> cyrusdb_skiplist.c: 622: db->lock_status == UNLOCKED
> Jul 11 10:32:41 mailhost.informatik.uni-hamburg.de cyradm[13944]: [ID 558109 
> local6.error] skiplist: closed while still locked
> 
> Also, I see tons of skiplist 'already open' messages in my syslog like:
> 
> Jul 11 09:59:33 mailhost.informatik.uni-hamburg.de imaps[9792]: [ID 412576 
> local6.notice] skiplist: /var/imap/user/z/zierke.seen is already open 1 time, 
> returning object
> 
> What's wrong?  And can I safely go back to Cyrus 2.3.11 binaries without
> botching up my Cyrus databases?

That would be the skiplist sanity checks finding a latent bug in another
part of Cyrus.

Are you able to send me the exact sequence of commands your script runs?
Is there anything else between the deletes?  I'll go have a look at the
source code.

Bron.



Re: Skiplist errors on Cyrus 2.3.12

2008-07-11 Thread Bron Gondwana
On Fri, Jul 11, 2008 at 11:37:52AM +0200, Reinhard Zierke wrote:
> Hi,
> 
> Since I upgraded my Solaris 10 Postfix/Cyrus mail server to Cyrus 2.3.12,
> I have some problems when removing mailboxes.  I try to delete
> several users' mailboxes within one call of cyradm or in a homemade Perl
> script that uses Cyrus::IMAP::Admin.  The first delete in one command
> invocation always works fine, but from the second user on I get a terse
> error message "cyrus: c:" and the syslog shows something like:
> 
> Jul 11 10:32:41 mailhost.informatik.uni-hamburg.de cyradm[13944]: [ID 637875 
> local6.error] Fatal error: Internal error: assertion failed: 
> cyrusdb_skiplist.c: 622: db->lock_status == UNLOCKED
> Jul 11 10:32:41 mailhost.informatik.uni-hamburg.de cyradm[13944]: [ID 558109 
> local6.error] skiplist: closed while still locked
> 
> Also, I see tons of skiplist 'already open' messages in my syslog like:
> 
> Jul 11 09:59:33 mailhost.informatik.uni-hamburg.de imaps[9792]: [ID 412576 
> local6.notice] skiplist: /var/imap/user/z/zierke.seen is already open 1 time, 
> returning object
> 
> What's wrong?  And can I safely go back to Cyrus 2.3.11 binaries without
> botching up my Cyrus databases?

Oh yeah, a copy of your imapd.conf and whether you apply any patches
would be nice to know too.  Handy for reconstructing the problem!
I apply two patches to skiplists on 2.3.12 (yeah, I know, after all
the stuff I've had pushed upstream too):

hmm... and I realise I haven't updated the website for a while.
Doing that now... ok:

http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-safeunlock-2.3.12.diff
http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-readlocktracking-2.3.12.diff

I'd be interested to know if the issue still exists with these.
They tidy up the logic for locks even more.  I needed it to make
the fast_rename and folder_limit stuff work again.

Bron.



Re: Auto-deletion of messages in Junk-folder after a certain time

2008-07-15 Thread Bron Gondwana
On Mon, Jul 14, 2008 at 01:54:01PM +0200, Marten Lehmann wrote:
> Hello,
> 
> we have a virtual domain configuration and I want to remove all messages 
> within the folder
> 
> user/@/Junk/*

Being the filthy perl programmer that I am, I would just make an admin
IMAP connection to the server, LIST all mailboxes, regex match the ones
I wanted, select them and process them.

> I don't want to mark old messages as deleted and expunge them, because 
> then maybe I'm expunging messages, that haven't been flagged as deleted 
> by me but the owner of the mailbox and aren't ment to be expunged at 
> this moment.

We do this by setting our own special flag (in addition to the regular
\Deleted flag) and then SEARCH for those messages only and UIDEXPUNGE
them.

But if you're deleting ALL messages, then it doesn't really matter does
it?  Unless you're doing some sort of age based thing, in which case
like I said - UIDEXPUNGE.  The flag just lets us persist the action
across dropped connections.
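
Roughly, in Python with an imaplib-style connection (a sketch, not our
production code): the "$AutoExpire" flag name is invented, the LIST parsing is
simplistic, and the server needs UIDPLUS for the UID EXPUNGE command.

```python
import re

# Last token of a LIST response line: a quoted string or a bare atom.
LIST_NAME = re.compile(rb'(?:"((?:[^"\\]|\\.)*)"|(\S+))$')


def matching_mailboxes(list_lines, pattern):
    """Pick mailbox names out of raw LIST response lines by regex."""
    names = []
    for line in list_lines:
        m = LIST_NAME.search(line) if isinstance(line, bytes) else None
        if m is None:
            continue
        raw = m.group(1) if m.group(1) is not None else m.group(2)
        name = raw.decode("utf-8", "replace")
        if re.search(pattern, name):
            names.append(name)
    return names


def expunge_flagged(conn, mailbox, criteria, flag="$AutoExpire"):
    """Flag messages matching `criteria`, then expunge only flagged UIDs.

    conn is an authenticated imaplib.IMAP4 (or compatible) object.
    """
    conn.select('"%s"' % mailbox)
    typ, data = conn.uid("SEARCH", None, *criteria)
    uids = data[0].split() if data and data[0] else []
    if uids:
        uidset = b",".join(uids).decode("ascii")
        conn.uid("STORE", uidset, "+FLAGS.SILENT", "(\\Deleted %s)" % flag)
    # Search by our own flag, so work left over from a dropped
    # connection is picked up as well.
    typ, data = conn.uid("SEARCH", None, "KEYWORD", flag)
    flagged = data[0].split() if data and data[0] else []
    if flagged:
        conn.uid("EXPUNGE", b",".join(flagged).decode("ascii"))
    return [int(u) for u in flagged]
```

For an age-based cleanup the criteria would be something like
("BEFORE", "01-Jul-2008").  Messages the user marked \Deleted themselves don't
carry the extra flag, so they are left alone.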

Bron.



Re: patches since 2.3.12-p2?

2008-07-19 Thread Bron Gondwana
On Fri, Jul 18, 2008 at 11:57:28AM +0200, Per olof Ljungmark wrote:
> and
> skiplist: unlock while not locked

This is almost certainly a bug.  I added this along with a bunch
of other skiplist changes to find places where the database
interface wasn't being used correctly, because it means bugs of
some sort.

There's another skiplist bug I've been trying to track down
(multiple deletes on the same connection failing), but haven't
been able to reproduce it yet.

Unfortunately, the cyrus database interface sort of sucks from
a consistency perspective, it's dangerous to call any function
that needs database access if you have a database transaction
open, because the code doesn't know about the transaction and
blindly goes ahead and starts a new transaction, which doesn't
work.

The code now throws an error immediately rather than causing
corruption.  Much better :)
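
The fail-fast idea can be sketched in a few lines of Python.  This is a
toy stand-in, not the skiplist code: the class, method and exception
names are all made up for illustration.

```python
class LockStateError(RuntimeError):
    """Raised when the database handle is used out of order."""


class ToyDB:
    """Stand-in for a single-writer database handle (hypothetical)."""

    def __init__(self):
        self.locked = False
        self.data = {}

    def begin(self):
        # A second begin would clobber the open transaction: refuse it.
        if self.locked:
            raise LockStateError("begin while a transaction is already open")
        self.locked = True

    def commit(self):
        if not self.locked:
            raise LockStateError("unlock while not locked")
        self.locked = False

    def store(self, key, value):
        # A helper that opens its own transaction is exactly the kind of
        # caller that is dangerous inside someone else's transaction.
        self.begin()
        self.data[key] = value
        self.commit()
```

Calling db.begin() and then db.store(...) raises at once, which points
at the offending caller instead of leaving a damaged file to find later.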

Bron ( ok, so far I've only seen this happen in my own bogus
   patches, but it's still better to be safe! )



Re: patches since 2.3.12-p2?

2008-07-21 Thread Bron Gondwana
On Mon, Jul 21, 2008 at 12:17:14PM +0100, Ian G Batten wrote:
>
> On 19 Jul 08, at 1313, Bron Gondwana wrote:
>>
>> Unfortunately, the cyrus database interface sort of sucks from
>> a consistency perspective, it's dangerous to call any function
>> that needs database access if you have a database transaction
>> open
>
> I understand some of the technical, philosophical and historical reasons 
> why this isn't the case, but every now and again I find myself wishing 
> that Cyrus had an SQL backend for the various databases (perhaps not 
> delivery, because losing it isn't the end of the world, but certainly for 
> mailboxes).
>
> In our case, we have really big Oracle and Postgres systems that could
> probably handle the load imposed by our mailsystem metadata as well as
> our mailsystem copes with it itself via skiplist, but we could
> then manage those databases with the same tools we use for the
> production systems (hot backups, replication, etc).
>
> Losing the mailboxes database can spoil your whole day, and the lengths 
> we go to to keep it safe (snapshots of the filesystem, hourly runs of 
> ctl_mboxlist -d, etc, etc) shouldn't really be necessary if it were in a 
> production SQL database.
>
> In my copious spare time, I might take a pass at the code and see how
> hard it looks.

Muahahahahaha.

Erhum.

Actually, the interface itself isn't that bad.

Managing transactions might give you headaches though.

And connections would probably be per-imapd process, so be prepared to
have 4k connections sitting mostly idle or lots of startup/shutdown of
connections.


Bron ( not having done any real C library SQL coding myself, I'd suggest
   probably some sort of generic DBI-style layer rather than a single
   database at a time if you go this route )
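
To show what a DBI-style layer buys you, here is a small sketch in
Python, whose DB-API plays a similar role to Perl's DBI.  The table
name, column names and class are invented for the example, sqlite3 only
stands in for a real server, and CREATE TABLE IF NOT EXISTS is assumed
to be accepted by whatever backend is plugged in.

```python
import sqlite3


class MailboxList:
    """Key/value store over any DB-API 2.0 connection (hypothetical schema)."""

    def __init__(self, conn, placeholder="?"):
        self.conn = conn
        self.ph = placeholder  # "?" for sqlite3; many other drivers use "%s"
        cur = conn.cursor()
        cur.execute("CREATE TABLE IF NOT EXISTS mboxlist "
                    "(name TEXT PRIMARY KEY, entry TEXT)")
        conn.commit()

    def store(self, name, entry):
        # Delete-then-insert inside one transaction avoids relying on any
        # backend-specific upsert syntax.
        cur = self.conn.cursor()
        cur.execute("DELETE FROM mboxlist WHERE name = %s" % self.ph, (name,))
        cur.execute("INSERT INTO mboxlist (name, entry) VALUES (%s, %s)"
                    % (self.ph, self.ph), (name, entry))
        self.conn.commit()

    def fetch(self, name):
        cur = self.conn.cursor()
        cur.execute("SELECT entry FROM mboxlist WHERE name = %s" % self.ph,
                    (name,))
        row = cur.fetchone()
        return row[0] if row else None


db = MailboxList(sqlite3.connect(":memory:"))
db.store("user.fred", "example entry")
```

Only the connection object and the placeholder style change per backend;
the transaction handling stays the caller's problem either way.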



Re: help with cyr_expire please

2008-07-27 Thread Bron Gondwana
On Sun, Jul 27, 2008 at 11:18:43AM +0200, Per olof Ljungmark wrote:
> It seems I do not fully understand how cyr_expire works. Some background:
> 
> I've set up a testing server with 2.3.12-p2 with delayed expunge and 
> delayed delete. In cyrus.conf:
> delpruneandexpunge  cmd="cyr_expire -D 7 -E 3 -X 7" at=0400
> 
> and it all worked as expected.
> 
> Now I'm ready to implement this on our production (2.3.12-p2) machine, 
> added the proper statements to imapd.conf
> expunge_mode: delayed
> delete_mode: delayed
> deletedprefix: DELETED
> 
> Then I tested the following command:
> su cyrus -c "/usr/local/cyrus/bin/cyr_expire -D 5 -E 5 -X 5 -p 
> user.spamdump -v"
> 
> and to my horror it did not only delete expunged messages but a fair 
> share of messages without any flags set:
> expunged 677607 out of 682456 messages from 30 mailboxes

Did you check if any "live" messages were actually deleted, or was it
just expunged messages cleaned up?  I can imagine that being the case
for a spamdump user that you clean up frequently.

Bron.





Re: even more questions on replication and expire

2008-07-29 Thread Bron Gondwana
On Tue, Jul 29, 2008 at 12:16:01PM +0200, Rudy Gevaert wrote:
> Per olof Ljungmark wrote:
> > In the course of setting up delayed expunge on our production server I 
> > came across the following;
> > 
> > - With delayed_expunge on the master, messages that are expunged by a 
> > user will be retained -X days on the master but immediately deleted on 
> > the replica unless it has delayed_expunge too.
> > 
> > So if I implement delayed_expunge on the replica, do I need cyr_expire 
> > to permanently remove messages after -X days or will sync_client do 
> > that?
> 
> yes

That's "yes" to "you need to run cyr_expire on the replica too".

Bron.



Re: even more questions on replication and expire

2008-07-29 Thread Bron Gondwana
On Tue, Jul 29, 2008 at 05:09:05PM +0200, Per olof Ljungmark wrote:
> Bron Gondwana wrote:
>> On Tue, Jul 29, 2008 at 12:16:01PM +0200, Rudy Gevaert wrote:
>>> Per olof Ljungmark wrote:
>>>> In the course of setting up delayed expunge on our production 
>>>> server I came across the following;
>>>>
>>>> - With delayed_expunge on the master, messages that are expunged by 
>>>> a user will be retained -X days on the master but immediately 
>>>> deleted on the replica unless it has delayed_expunge too.
>>>>
>>>> So if I implement delayed_expunge on the replica, do I need 
>>>> cyr_expire to permanently remove messages after -X days or will 
>>>> sync_client do that?
>>> yes
>>
>> That's "yes" to "you need to run cyr_expire on the replica too".
>
> Thanks for the info. I can't help wonder if this was a firm design  
> decision? From a user perspective it should be easier if this followed  
> the synchronization I believe.
>
> Anyway, thanks, that was the last piece needed to finish off.

I would much prefer that it was done via synchronisation as well.  It's
a pain from a consistency point of view.

But there you go.

Bron.



Re: Cyrus vs Dovecot

2008-08-13 Thread Bron Gondwana
On Wed, Aug 13, 2008 at 01:07:34PM -0400, Wesley Craig wrote:
> On 13 Aug 2008, at 10:31, kbajwa wrote:
> > I think you are missing a point which is most important, i.e., what  
> > type of
> > support Cyrus vs Dovecot offers. In my experience:
> >
> > Cyrus  =  0
> > Dovecot=  100
> 
> As someone who answers many help requests for cyrus (and I'm very far  
> from the only one), I can honestly say I've never seen a request  
> from you.  Perhaps you've had a lot of occasion to ask for help with  
> Dovecot.  I'm happy to hear you've gotten that help.  Community is a  
> lot of what open source software is about.  As for your experience  
> with the cyrus imapd community, perhaps your sample size is too small.

Yeah, there are a few of us here answering help requests, and even
helping with debugging in some cases.  I'd be interested to see where
that '0' comes from too.

Still, I think Cyrus and Dovecot are the best two imap servers out
there, so it's going to be a question of which integrates best with
your usage pattern.  For a small server, starting with no experience
in either, I would probably choose Dovecot.  Now that I know Cyrus
inside out, back to front, warts and all - well, I'd choose Cyrus
because I know how to make it play nice.  It's more of a "total
system" in itself though, that you write support stuff around.
Dovecot integrates more with other tools in a unix-daemon'y way.

Enjoy,

Bron ( now if someone came along with a compelling competitor
   for SASL... )



Re: Deleting messages "marked for deletion" older than X days

2008-08-18 Thread Bron Gondwana
On Mon, Aug 18, 2008 at 05:09:53PM -0500, Kenneth Marshall wrote:
> In the manual page, the definition of the '-X' option seems to
> do what you want:
> 
>   -X expunge-days
>   Expunge  previously deleted messages older than expunge-days
>   (when using the "delayed" expunge mode).  The default is
>   0 (zero) days, which will expunge all previously deleted 
> messages.

Messages go through the following life cycle in "traditional IMAP",
for example:

Created     (\Recent)          - LMTP DELIVER (UID = 9)
No flags    ()                 - A001 SELECT INBOX (clears the \Recent)
Viewed      (\Seen)            - A002 UID FETCH 9 RFC822
Deleted     (\Deleted \Seen)   - A003 UID STORE 9 +FLAGS (\Deleted)
Purged      (no message)       - A004 EXPUNGE

Now, with delayed expunge, what the -X option actually does is turn
this into:

Created     (\Recent)          - LMTP DELIVER (UID = 9)
No flags    ()                 - A001 SELECT INBOX (clears the \Recent)
Viewed      (\Seen)            - A002 UID FETCH 9 RFC822
Deleted     (\Deleted \Seen)   - A003 UID STORE 9 +FLAGS (\Deleted)
Purged      (no index record)  - A004 EXPUNGE

but the file is still on disk; the index record has just been moved
from the file cyrus.index to a new file, cyrus.expunge.  A week later:

Cleaned up  (no file)          - cyr_expire -X 7

The cyrus.expunge record and the actual spool file itself get deleted at
this point.  Until then you can un-delete the record using the
"unexpunge" command in cyrus 2.3.X.
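
The life cycle above can be modelled in a few lines of Python.  This is
a toy model only: the dictionaries merely stand in for cyrus.index and
cyrus.expunge, the exact age boundary used by cyr_expire and the flags
restored by unexpunge are assumptions of the sketch.

```python
class Mailbox:
    """Toy model of one mailbox under delayed expunge (not the disk format)."""

    def __init__(self):
        self.index = {}      # uid -> set of flags          (cyrus.index)
        self.expunged = {}   # uid -> (day expunged, flags) (cyrus.expunge)
        self.spool = set()   # uids whose message file is still on disk

    def deliver(self, uid):
        self.index[uid] = {"\\Recent"}
        self.spool.add(uid)

    def store_deleted(self, uid):
        self.index[uid].add("\\Deleted")

    def expunge(self, today):
        """IMAP EXPUNGE: move \\Deleted records aside, keep the files."""
        doomed = [u for u, flags in self.index.items() if "\\Deleted" in flags]
        for uid in doomed:
            self.expunged[uid] = (today, self.index.pop(uid))

    def unexpunge(self, uid):
        """Bring back a record that has not been cleaned up yet."""
        day, flags = self.expunged.pop(uid)
        # Dropping \Deleted on restore is a choice made for this model.
        self.index[uid] = flags - {"\\Deleted"}

    def cyr_expire(self, today, expunge_days):
        """Like `cyr_expire -X N`: remove old expunged records and files."""
        # Whether the real comparison is >= or > is not checked here.
        old = [u for u, (day, _) in self.expunged.items()
               if today - day >= expunge_days]
        for uid in old:
            del self.expunged[uid]
            self.spool.discard(uid)
```

After expunge() the message is gone from the index but still in the
spool, and it only leaves the spool once cyr_expire() runs with the
record old enough.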

---

I think what the original requestor was actually looking for is a tool
that can run the "EXPUNGE" phase on a regular basis.  As far as I'm
aware there's nothing that ships with Cyrus that can do it.  If I was
writing something for the job I would make an admin IMAP connection to
Cyrus and just cycle through the folders calling 'EXPUNGE' on them.
Cheap and nasty, but it would do the trick.  You can do this in any
language with a TCP library, though something with an IMAP interface
library would be nicer.  I'd use Perl and Mail::IMAPTalk, but that's
just because that's what I already use!
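
For anyone without Perl to hand, the same cheap-and-nasty loop might
look roughly like this with Python's standard imaplib.  The function
names are made up for the sketch, the host and credentials are
placeholders, and mailbox names that the server sends as literals are
simply skipped.

```python
import imaplib
import re

# One untagged LIST line: (flags) "separator" mailbox-name
LIST_RE = re.compile(rb'^\((?P<flags>[^)]*)\) (?P<sep>"[^"]*"|NIL) (?P<name>.+)$')


def parse_list_line(line):
    """Split a LIST response line into (flags, mailbox name as sent)."""
    m = LIST_RE.match(line)
    if not m:
        return None
    flags = [f.lower() for f in m.group("flags").decode().split()]
    return flags, m.group("name").decode()


def expunge_all(conn):
    """SELECT and EXPUNGE every selectable mailbox this login can see."""
    typ, lines = conn.list()
    done = []
    for line in lines:
        if not isinstance(line, bytes):
            continue  # None (no mailboxes) or a literal returned as a tuple
        parsed = parse_list_line(line)
        if parsed is None or "\\noselect" in parsed[0]:
            continue
        name = parsed[1]  # already an atom or quoted string, reuse as-is
        typ, _ = conn.select(name)
        if typ == "OK":
            conn.expunge()
            done.append(name)
    return done


# Usage (placeholders):
#   conn = imaplib.IMAP4("imap.example.com")
#   conn.login("admin", "secret")
#   print(expunge_all(conn))
```

Note that a plain EXPUNGE removes everything marked \Deleted in each
folder, including messages the owner marked themselves.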

Bron.



Re: Is unixhierarchysep:1 required when using virtdomain?

2008-08-27 Thread Bron Gondwana
On Tue, Aug 26, 2008 at 03:11:41PM +0200, tarjei wrote:
> I read somewhere that setting  unixhierarchysep to true is required when
> using virtdomain, but this is not mentioned on the man page.

Wow, you could have fooled me.
 
> Is there something missing on the manpage, or have I just misunderstood
> something?

We have a few hundred thousand users who are domain split, and
we don't use unixhierarchysep.

> Also, what problems will I face if I set it to false? One thing is
> client issues, but what about acls, etc?

Changing it on the fly sounds messy.

Bron.



Re: Is unixhierarchysep:1 required when using virtdomain?

2008-08-27 Thread Bron Gondwana

On Wed, 27 Aug 2008 14:36:58 +0200, "Rudy Gevaert" <[EMAIL PROTECTED]> said:
> Bron Gondwana wrote:
> > On Tue, Aug 26, 2008 at 03:11:41PM +0200, tarjei wrote:
> >> I read somewhere that setting  unixhierarchysep to true is required when
> >> using virtdomain, but this is not mentioned on the man page.
> > 
> > Wow, you could have fooled me.
> >  
> >> Is there something missing on the manpage, or have I just misunderstood
> >> something?
> > 
> > We have a few hundred thousand users who are domain split, and
> > we don't use unixhierarchysep.
> 
> But then you don't have '.' in their user names, right?

Nope.  Never wanted a dot in a username.  Usernames belong in 
[a-z][a-z0-9_]+ space.  And strictly, trailing _ is pretty
bogus too.

:)
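
As a quick sketch of that rule in Python (the function name and the
"strict" variant that also rejects a trailing underscore are made up
for the example):

```python
import re

# The pattern from the message: a letter, then one or more of [a-z0-9_].
USERNAME_RE = re.compile(r"[a-z][a-z0-9_]+")
# Stricter variant: same minimum length, but must not end in "_".
STRICT_RE = re.compile(r"[a-z][a-z0-9_]*[a-z0-9]")


def valid_username(name, strict=False):
    """True if the whole name fits the pattern (no dots, so no clash
    with '.' as the hierarchy separator)."""
    rx = STRICT_RE if strict else USERNAME_RE
    return rx.fullmatch(name) is not None
```

With this, "john.smith" is rejected either way, while "bron_" passes
only the non-strict check.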

Bron.
-- 
  Bron Gondwana
  [EMAIL PROTECTED]




Re: Very annoying IMAP problem (cyrus + Outlook)

2008-09-01 Thread Bron Gondwana
On Mon, Sep 01, 2008 at 03:37:19PM -0400, Wesley Craig wrote:
> On 01 Sep 2008, at 14:50, Denis BUCHER wrote:
> > What should I do next to solve my problem ?
> 
> There are actually a couple of places cyrus might give the fatal  
> error "word too long".  The prot routines should be recording the  
> interactions before passing the data up to the imap layer where the  
> parsing error occurs.  Are there any long lines in your telemetry?   
> The bad line should directly precede "* BYE Fatal error: word too  
> long" in the telemetry (if I'm reading imapd's fatal() routine  
> correctly).
> 
> AFAIK, there's nothing to be done other than adjusting the MAXWORD  
> and/or MAXQUOTED limits.  That means upgrading or recompiling the old  
> version that you're on.

This is what we apply at FastMail:

-MAXQUOTED = 32768,
-MAXWORD = 32768,
+MAXQUOTED = 524288,
+MAXWORD = 524288,

Our smallest IMAP server has 6GB of memory, so we really don't need
baby-sized buffers :)

Bron.



Replication errors: missing subscription

2008-09-02 Thread Bron Gondwana
Our Cyrus 2.3.12 + patches replication system has been running very
reliably for months - to the point where the only issues our
checkreplication script tends to find are either:

a) cases where someone has reconstructed and not run quota -f
   afterwards, causing quota mismatches.  (this is mostly the
   fault of bits of our code that need updating!)

b) subscriptions missing on the replica.

I have a suspicion that most of these could be avoided by the simple
expedient of switching from putting individual subscription records
into the sync log to copying the entire user.sub file.

(I've also changed setseen_all to just overwrite the user.seen file 
rather than attempt some sort of merger.  It's a replica, the master
is right!  This will break if you're using a different database type
on the replica than the master of course - but that's why you
shouldn't be sending binary formats over the wire in the first place.
It's already going to break)
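
A toy model of why whole-file copying is more forgiving, in Python.
Sets of mailbox names stand in for user.sub here, and the function
names are invented; none of this is how sync_client is implemented.

```python
def replay_log(replica, events):
    """Apply individual ("SUB" | "UNSUB", mailbox) records to a replica."""
    for action, mailbox in events:
        if action == "SUB":
            replica.add(mailbox)
        else:
            replica.discard(mailbox)
    return replica


def copy_whole_file(master):
    """Replace the replica's state with a copy of the master's."""
    return set(master)


master = {"INBOX", "lists.cyrus"}
log = [("SUB", "INBOX"), ("SUB", "lists.cyrus")]

# If one log record goes missing, replaying the rest leaves the replica
# short, and nothing later in the log will ever repair it.
partial = replay_log(set(), log[:1])

# Copying the whole set converges no matter what happened before.
copied = copy_whole_file(master)
```

The whole-file copy is also idempotent: running it twice, or after a
half-applied log, gives the same result.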

Bron.




Re: Skiplist errors on Cyrus 2.3.12

2008-09-02 Thread Bron Gondwana
On Fri, Jul 11, 2008 at 11:37:52AM +0200, Reinhard Zierke wrote:
> Jul 11 10:32:41 mailhost.informatik.uni-hamburg.de cyradm[13944]: [ID 637875 
> local6.error] Fatal error: Internal error: assertion failed: 
> cyrusdb_skiplist.c: 622: db->lock_status == UNLOCKED
> Jul 11 10:32:41 mailhost.informatik.uni-hamburg.de cyradm[13944]: [ID 558109 
> local6.error] skiplist: closed while still locked

We think we've figured this one out now :)  Finally.  John Capo came up
with a basic patch that fixed it, and I've done a slightly more
ambitious refactor.  Rudy has tested my patch, and we're running it at
FastMail as well.

I've rebuilt our webpage with the new patch included.  NOTE: this patch
obsoletes the old readlocktracking patch, and conflicts with it.  This
way is much cleaner.

Bron.

http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-locking-rework-2.3.12.diff


