Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-14 Thread João Assad
Derrick J Brashear wrote: is there a munmap for every mmap when I start cyrus, both mupdate master and slave comes up, so I get cyrus/mupdate[2678]: MMAP at map_refresh: mmap(0, 408084480, PROT_READ, MAP_SHARED, 8, 0) = 1098067968 cyrus/mupdate[2679]: MMAP at map_refresh: mmap(0, 408084480, PROT_

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-13 Thread João Assad
João Assad wrote: yet another gdb backtrace of the mmap problem http://www.gazzag.com/gdb.output2.gz regards Hey Derrick, did you find anything useful in this last backtrace? Regards --- Cyrus Home Page: http://asg.web.cmu.edu/cyrus Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu List Archives/In

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-12 Thread João Assad
yet another gdb backtrace of the mmap problem http://www.gazzag.com/gdb.output2.gz regards

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-11 Thread João Assad
Got a full gdb backtrace of the mmap crash. It's 400k, so you can download it from: http://www.gazzag.com/gdb.output.gz regards

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-09 Thread João Assad
Out of curiosity: every time cyrus needs to mmap/munmap something, it uses its mmap wrappers, map_refresh and map_free. So every once in a while I get this in strace: 14:27:02.148097 munmap(0x54968000, 373915648) = 0 14:27:02.148765 mmap2(NULL, 373932032, PROT_READ, MAP_SHARED, 8, 0
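To check whether every mapping in a trace is eventually released, the munmap/mmap2 pairs can be tallied mechanically. The following is a minimal sketch in Python, assuming strace's usual one-call-per-line output; the function name `outstanding_mappings` is hypothetical, and the return address in the sample is made up because the original line is cut off. In the two calls quoted above, the new mapping is 373932032 - 373915648 = 16384 bytes larger than the one it replaces.

```python
import re

# Successful mmap/mmap2 call: capture the requested length and the returned address.
MMAP_RE = re.compile(
    r"\bmmap2?\([^,]+,\s*(\d+),.*\)\s*=\s*(0x[0-9a-fA-F]+|\d+)\s*$"
)
# Successful munmap call: capture the address being released.
MUNMAP_RE = re.compile(r"\bmunmap\((0x[0-9a-fA-F]+|\d+),\s*\d+\)\s*=\s*0\s*$")


def outstanding_mappings(lines):
    """Return {address: length} for mappings that were never unmapped."""
    live = {}
    for line in lines:
        m = MUNMAP_RE.search(line)
        if m:
            # int(x, 0) accepts both hex (0x...) and decimal addresses.
            live.pop(int(m.group(1), 0), None)
            continue
        m = MMAP_RE.search(line)
        if m:
            live[int(m.group(2), 0)] = int(m.group(1))
    return live


if __name__ == "__main__":
    sample = [
        "14:27:02.148097 munmap(0x54968000, 373915648) = 0",
        # The return value below is illustrative only; the source line is truncated.
        "14:27:02.148765 mmap2(NULL, 373932032, PROT_READ, MAP_SHARED, 8, 0) = 0x54964000",
    ]
    for addr, length in outstanding_mappings(sample).items():
        print(f"still mapped: {addr:#x} ({length} bytes)")
```

A steadily growing result over a long trace would point at mappings that are created but never released; a stable result would suggest the address space is being exhausted some other way.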

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-09 Thread João Assad
Henrique de Moraes Holschuh wrote: Just thought of something. Please set the vm.overcommit_memory sysctl to 2 (it is available under /proc/sys, I think, but the right way is to use /etc/sysctl.conf and sysctl). Make *really* sure you have enough swap when you do that. You will *really* need it. Som
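Before changing the setting it can help to confirm what the kernel is currently using. A small sketch, assuming a Linux host where the value is exposed at /proc/sys/vm/overcommit_memory; the function name `overcommit_mode` and the `path` parameter are hypothetical conveniences, the latter only so the function can be pointed at a test file.

```python
from pathlib import Path

# Meaning of the three documented values of vm.overcommit_memory on Linux.
MODES = {
    0: "heuristic overcommit (kernel default)",
    1: "always overcommit",
    2: "strict accounting, no overcommit",
}


def overcommit_mode(path="/proc/sys/vm/overcommit_memory"):
    """Read the current overcommit policy and return (value, description)."""
    value = int(Path(path).read_text().strip())
    return value, MODES.get(value, "unknown")


if __name__ == "__main__":
    value, description = overcommit_mode()
    print(f"vm.overcommit_memory = {value}: {description}")
```

With mode 2 the kernel refuses allocations it cannot back with RAM plus swap, which is why the advice above stresses having enough swap configured first.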

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-09 Thread Henrique de Moraes Holschuh
On Mon, 04 Apr 2005, João Assad wrote: > >The following options seem to have a direct impact on how fast I run > >out of resources (obviously) . The more I increase them, the faster I > >get the mmap error. > > > >*mupdate_workers_start > >mupdate_workers_minspare > >mupdate_workers_maxspare > >m

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-08 Thread João Assad
Derrick J Brashear wrote: So, prior to this presumably you've mmap2()'d some memory, have there been any munmaps for these mmap2s 00:25:58.583761 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb68e8000 00:25:58.585461 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRI

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-08 Thread Derrick J Brashear
On Fri, 8 Apr 2005, João Assad wrote: I could do a strace -f which would dump all the traces from all the threads into a single file... but it's a nightmare to read. By reading some strace output here I've noticed mmaps complaining about ENOMEM way before the mmap inside map_refresh goes crazy.
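A combined strace -f dump is easier to handle if the failing calls are filtered out first. The sketch below is one way to do that in Python, assuming strace's standard failure format (`= -1 ENOMEM (Cannot allocate memory)`) and either the `[pid N]` prefix or the leading pid column that strace -f produces; the function name `find_enomem` and the sample pid are hypothetical.

```python
import re

# Optional pid prefix ("[pid 1234] " or "1234 "), then an mmap/mmap2 call
# whose result is -1 ENOMEM. Captures: pid (either form) and requested length.
ENOMEM_RE = re.compile(
    r"^(?:\[pid\s+(\d+)\]\s+|(\d+)\s+)?"
    r".*?\bmmap2?\([^,]+,\s*(\d+),.*=\s*-1 ENOMEM"
)


def find_enomem(lines):
    """Return a list of (pid or None, requested_bytes) for failed mmap calls."""
    hits = []
    for line in lines:
        m = ENOMEM_RE.search(line)
        if m:
            pid = m.group(1) or m.group(2)
            hits.append((int(pid) if pid else None, int(m.group(3))))
    return hits


if __name__ == "__main__":
    sample = [
        # Successful call, quoted from the thread: ignored.
        "00:25:58.583761 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, "
        "MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb68e8000",
        # Illustrative failure line; the pid and timestamp are made up.
        "[pid  2678] 00:26:01.000000 mmap2(NULL, 408084480, PROT_READ, "
        "MAP_SHARED, 8, 0) = -1 ENOMEM (Cannot allocate memory)",
    ]
    for pid, size in find_enomem(sample):
        print(f"pid {pid}: mmap of {size} bytes failed with ENOMEM")
```

Sorting the hits by size would show whether the early failures are small anonymous allocations or the large mailboxes.db mapping itself.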

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-08 Thread João Assad
João Assad wrote: Derrick J Brashear wrote: curiously, the strace output isn't showing an mmap() call fail, that I see, before the error shows up. I could do a strace -f which would dump all the traces from all the threads into a single file... but it's a nightmare to read. by reading some strac

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-08 Thread João Assad
João Assad wrote: Derrick J Brashear wrote: On Fri, 8 Apr 2005, João Assad wrote: João Assad wrote: Managed to get a backtrace using debug_command (thanks for this nifty feature, Henrique de Moraes). 2 gdb backtraces from the production server. curiously, the strace output isn't showing an mmap(

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-08 Thread Derrick J Brashear
On Fri, 8 Apr 2005, João Assad wrote: João Assad wrote: Managed to get a backtrace using debug_command (thanks for this nifty feature, Henrique de Moraes). 2 gdb backtraces from the production server. curiously, the strace output isn't showing an mmap() call fail, that I see, before the error sho

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-08 Thread João Assad
João Assad wrote: Managed to get a backtrace using debug_command (thanks for this nifty feature, Henrique de Moraes) and now a strace from the production server. I'm sending just the last few lines of it; it's really big. 16:18:02.399469 accept(4, 0, NULL) = 104 16:18:02.470386 getpeername(1

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-08 Thread João Assad
João Assad wrote: Managed to get a backtrace using debug_command (thanks for this nifty feature, Henrique de Moraes). 2 gdb backtraces from the production server. #18988 0x0804dcd3 in fatal (s=0x8d52f070 "Internal error: assertion failed: mupdate.c: 586: 0", cod

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-07 Thread João Assad
João Assad wrote: Derrick J Brashear wrote: On Thu, 7 Apr 2005, João Assad wrote: Ok I got a backtrace (I think). I don't really know how to use gdb. did you compile without giving gcc the -g option? Probably. Having unstripped binaries with useful symbols would probably make for a more useful

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-07 Thread João Assad
Derrick J Brashear wrote: On Thu, 7 Apr 2005, João Assad wrote: Ok I got a backtrace (I think). I don't really know how to use gdb. did you compile without giving gcc the -g option? Probably. Having unstripped binaries with useful symbols would probably make for a more useful backtrace. (at lea

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-07 Thread Derrick J Brashear
On Thu, 7 Apr 2005, João Assad wrote: Ok I got a backtrace (I think). I don't really know how to use gdb. Did you compile without giving gcc the -g option? Probably. Having unstripped binaries with useful symbols would probably make for a more useful backtrace. (At least I hope so.)

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-07 Thread João Assad
Derrick J Brashear wrote: On Wed, 6 Apr 2005, João Assad wrote: and then give us a backtrace from the core which you will then get? After doing that, the mupdate process now exits with signal 11 as expected. OTOH the core isn't getting dumped to disk for some reason... The only internal resource

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-07 Thread João Assad
Derrick J Brashear wrote: On Wed, 6 Apr 2005, João Assad wrote: and then give us a backtrace from the core which you will then get? After doing that, the mupdate process now exits with signal 11 as expected. OTOH the core isn't getting dumped to disk for some reason... The only internal resource

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-07 Thread Derrick J Brashear
On Wed, 6 Apr 2005, João Assad wrote: and then give us a backtrace from the core which you will then get? After doing that, the mupdate process now exits with signal 11 as expected. OTOH the core isn't getting dumped to disk for some reason... The only internal resource limit play happens for fds,

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-06 Thread João Assad
Derrick J Brashear wrote: I've noticed that once I get the mmap error, the system will still run without spitting db errors for anywhere from a few minutes to a few hours. Also, I never get more than one mmap error before the db becomes unusable. that's not shocking. i'd still like to know why m

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-06 Thread Derrick J Brashear
On Wed, 6 Apr 2005, João Assad wrote: cyrus/mupdate[12614]: IOERROR: mapping /var/lib/imap/mailboxes.db file: Cannot allocate memory Resource limited memory, or are you really running out of memory? Letting processes continue running in the face of an mmap failure needs to be re-examined I guess

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-06 Thread João Assad
cyrus/mupdate[12614]: IOERROR: mapping /var/lib/imap/mailboxes.db file: Cannot allocate memory Resource limited memory, or are you really running out of memory? Letting processes continue running in the face of an mmap failure needs to be re-examined I guess. Hello again. I've been trying to t

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-04 Thread João Assad
The following options seem to have a direct impact on how fast I run out of resources (obviously). The more I increase them, the faster I get the mmap error: mupdate_workers_start, mupdate_workers_minspare, mupdate_workers_maxspare, mupdate_workers_max. I have them all set to the default values

Re: cyrus-murder problems with database corruption in the frontend/master

2005-04-01 Thread João Assad
João Assad wrote: The system has plenty of RAM available and ulimit -a reports the max virtual memory as unlimited: core file size (blocks, -c) 0; data seg size (kbytes, -d) unlimited; file size (blocks, -f) unlimited; max locked memory (kbytes, -l) 32; max memory size
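The shell's ulimit output describes the shell, not necessarily the daemon, so it can be worth reading the limits from inside a process started the same way the service is. A minimal sketch using Python's standard `resource` module on a Unix host; the function name `describe_limits` and the choice of three limits are assumptions for illustration. Note that `getrlimit` reports bytes, whereas ulimit prints kbytes or blocks.

```python
import resource

# Limits that correspond to lines in the ulimit -a listing.
LIMITS = {
    "address space (ulimit -v)": resource.RLIMIT_AS,
    "core file size (ulimit -c)": resource.RLIMIT_CORE,
    "data seg size (ulimit -d)": resource.RLIMIT_DATA,
}


def describe_limits():
    """Return {label: soft limit in bytes, or 'unlimited'} for this process."""
    result = {}
    for label, res in LIMITS.items():
        soft, _hard = resource.getrlimit(res)
        result[label] = "unlimited" if soft == resource.RLIM_INFINITY else soft
    return result


if __name__ == "__main__":
    for label, value in describe_limits().items():
        print(f"{label}: {value}")
```

A soft core file size of 0, as in the listing above, stops the kernel from writing core files at all, which may be worth ruling out when a crash leaves no core on disk.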

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-31 Thread João Assad
cyrus/mupdate[12614]: IOERROR: mapping /var/lib/imap/mailboxes.db file: Cannot allocate memory cyrus/mupdate[12614]: failed to mmap /var/lib/imap/mailboxes.db file cyrus/master[12580]: service mupdate pid 12614 in READY state: terminated abnormally Henrique de Moraes Holschuh wrote: Chec

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-31 Thread Henrique de Moraes Holschuh
On Thu, 31 Mar 2005, João Assad wrote: > >>My db just got corrupted again 2 hours ago. seems like moving 200k > >>mailboxes between backends really speeds it up. ... > cyrus/mupdate[12614]: IOERROR: mapping /var/lib/imap/mailboxes.db file: > Cannot allocate memory > cyrus/mupdate[12614]: failed

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-31 Thread Derrick J Brashear
On Thu, 31 Mar 2005, João Assad wrote: cyrus/mupdate[12614]: IOERROR: mapping /var/lib/imap/mailboxes.db file: Cannot allocate memory Resource limited memory, or are you really running out of memory? Letting processes continue running in the face of an mmap failure needs to be re-examined I guess

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-31 Thread João Assad
João Assad wrote: I needed to restart cyrus master and got this error minutes after doing it. cyrus/mupdate[12614]: IOERROR: mapping /var/lib/imap/mailboxes.db file: Cannot allocate memory cyrus/mupdate[12614]: failed to mmap /var/lib/imap/mailboxes.db file cyrus/master[12580]: service mupdate p

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-31 Thread João Assad
Derrick J Brashear wrote: On Thu, 31 Mar 2005, João Assad wrote: My db just got corrupted again 2 hours ago. Seems like moving 200k mailboxes between backends really speeds it up. I have my corrupted mailboxes.db and I can send it to anyone interested in taking a look. I needed to restart cyrus m

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-31 Thread Derrick J Brashear
On Thu, 31 Mar 2005, João Assad wrote: My db just got corrupted again 2 hours ago. Seems like moving 200k mailboxes between backends really speeds it up. I have my corrupted mailboxes.db and I can send it to anyone interested in taking a look. How about a URL for it?

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-31 Thread João Assad
Derrick J Brashear wrote: On Thu, 31 Mar 2005, João Assad wrote: My db just got corrupted again 2 hours ago. Seems like moving 200k mailboxes between backends really speeds it up. I have my corrupted mailboxes.db and I can send it to anyone interested in taking a look. How about a URL for it? the

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-31 Thread João Assad
Henrique de Moraes Holschuh wrote: On Thu, 31 Mar 2005, Sergio Devojno Bruder wrote: We've done 2 murder installations, cyrus 2.2.3 and cyrus 2.2.10, and already have seen the same type of corruption. (should ADD or DELETE), but not with the same frequency. Are you guys running the same di

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-31 Thread Henrique de Moraes Holschuh
On Thu, 31 Mar 2005, Sergio Devojno Bruder wrote: > We've done 2 murder installations, cyrus 2.2.3 and cyrus 2.2.10, and already > have seen the same type of corruption. (should ADD or DELETE), but not with > the same frequency. Are you guys running the same distribution or the same kernel? It lo

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-31 Thread Sergio Devojno Bruder
Derrick J Brashear wrote: Well, you have skiplist corruption, but there's not really anything in your report which is helpful at suggesting why you do, or helping to reproduce it so (if it is a bug) it can be tracked and killed. Even posting your corrupted skiplist would be more useful. >> >> Ye

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-30 Thread Derrick J Brashear
Well, you have skiplist corruption, but there's not really anything in your report which is helpful at suggesting why you do, or helping to reproduce it so (if it is a bug) it can be tracked and killed. Even posting your corrupted skiplist would be more useful. Yeah I know the info I gave isn

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-30 Thread João Assad
Derrick J Brashear wrote: On Tue, 29 Mar 2005, João Assad wrote: Come on guys, someone must have at least an idea I can try. Anything will help, maybe I'm missing something obvious. Well, you have skiplist corruption, but there's not really anything in your report which is helpful at suggesting wh

Re: cyrus-murder problems with database corruption in the frontend/master

2005-03-30 Thread Derrick J Brashear
On Tue, 29 Mar 2005, João Assad wrote: Come on guys, someone must have at least an idea I can try. Anything will help, maybe I'm missing something obvious. Well, you have skiplist corruption, but there's not really anything in your report which is helpful at suggesting why you do, or helping to re