Lachlan Michael wrote:

Real puzzler.  I'm surprised not to have at least one process growing,
though.  Maybe it's not using much CPU and you're not spotting it.
Following your advice, as far as I can tell, the mailman qrunner process

/usr/local/bin/python2.5 /usr/local/mailman/bin/qrunner
--runner=IncomingRunner:0:1 -s

is the one that crashes: all other mailman processes are unaffected. I
couldn't see it increase much in size (maybe it went from 8.5M to 12.5M),
then it just bombed and a new process was spawned (easy to tell by the
large increase in PID).
All I can think is that qrunner asks for such a large amount of memory in one go that it bombs out without ever growing. That fits with the ktrace output as well. Regrettably, I don't think you can tell *how* much memory was asked for. (The normal pattern with out of memory errors is for the process to grow and grow and grow and die; but it's not the only one).
Other things to try:  Up the stack size
   ulimit -s 262144

inside the mailman startup.  Again, I've had processes in the past which
needed this.
Ok, I am going to gradually try different limits. It seems as though setting
kern.maxssiz="256M"
and so on in /boot/loader.conf will allow me to increase the limits.
Having to reboot is a pain, though. How far can I go? 512M? (Physical
memory is 1GB)
Certainly not more than physical memory :-) To be honest, if 256M doesn't do it then this probably isn't the problem. I'm not particularly hopeful that this will do it, but in your circumstance I would try it. At the same time, you could also increase the data size (maxdsiz?) to 1Gb (yours looks like 0.5Gb, half your physical memory).

My limit settings (also 1Gb) look like:

datasize     1048576 kbytes
stacksize    262144 kbytes

which come from trying to set 256Mb and 1024Mb in the kernel config (old FreeBSD - no sysctls).

Keep the ulimit -a in the mailman startup script so you can confirm that you really get these numbers.
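
As a cross-check on the ulimit -a output, the same limits can be read from inside a Python process. This is only a minimal sketch using the standard resource module; the names describe and current_limits are made up for illustration, and running it from the same environment that starts qrunner is an assumption about your setup, not something mailman does itself.

```python
import resource


def describe(limit):
    """Render a limit in kbytes, the same unit ulimit reports."""
    if limit == resource.RLIM_INFINITY:
        return "unlimited"
    return "%d kbytes" % (limit // 1024)


def current_limits():
    """Return {name: (soft, hard)} for the data and stack size limits."""
    return {
        "datasize": resource.getrlimit(resource.RLIMIT_DATA),
        "stacksize": resource.getrlimit(resource.RLIMIT_STACK),
    }


if __name__ == "__main__":
    # Print both limits so they can be compared with the ulimit -a output.
    for name, (soft, hard) in sorted(current_limits().items()):
        print("%-12s soft=%s hard=%s" % (name, describe(soft), describe(hard)))
```

If the numbers printed here differ from what you set in /boot/loader.conf, the limits are being lowered somewhere between boot and the mailman startup.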

Can you email a file of the size you are
trying not through mailman?  Maybe your MTA (sendmail/postfix etc) has a
limit that somehow causes mailman to get this error.

This is definitely not the case. Users can receive (and send) similar
sized large attachments individually, so the MTA (sendmail in this case)
is not the cause.
OK - rule that out. The ktrace showing qrunner failing a break pretty much does that too.

The final suggestion is to try to trace (ktrace, strace from ports) the
process that is dying,
I'll admit it is my first time to try a ktrace, but after noting which
process it was that crashed I could identify the newly spawned PID, and
obtained a ktrace.out (binary) and a kdump  (called
mailman_process_log.txt) when the problem occurs by sending another large
mail attachment.  I'll leave the files up for a couple of days. (Both
files are about 2MB in size)

http://lachlan.lkla.org/tmp/mailman_memory_error/

Not that I can properly interpret the results, but it seems the mail file
is completely read, but whatever happens next causes the memory error.

52506 python2.5 RET   read 354/0x162
52506 python2.5 CALL  break(0x8add000)
52506 python2.5 RET   break 0
52506 python2.5 CALL  break(0x8cc3000)
52506 python2.5 RET   break -1 errno 12 Cannot allocate memory
The kdump output is the only useful bit, really. Your analysis seems correct to me.
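
To go through the rest of the kdump text without reading it by eye, something like the following could list every break call and how far each one tried to move the end of the data segment. It is a rough sketch only: it assumes lines look exactly like the excerpt above and that the dump covers a single process, and break_requests is a hypothetical helper name. The difference between two break addresses only shows the increment asked for by that one call, not the total the process wanted.

```python
import re

# Matches kdump lines of the form shown in the excerpt, e.g.
#   52506 python2.5 CALL  break(0x8add000)
#   52506 python2.5 RET   break -1 errno 12 Cannot allocate memory
CALL_RE = re.compile(r"CALL\s+break\((0x[0-9a-fA-F]+)\)")
RET_RE = re.compile(r"RET\s+break\s+(-?\d+)")


def break_requests(lines):
    """Yield (address, delta_bytes, succeeded) for each break call.

    delta_bytes is the growth relative to the last successful break,
    or None when no earlier successful break has been seen.
    """
    last_ok = None
    pending = None
    for line in lines:
        m = CALL_RE.search(line)
        if m:
            pending = int(m.group(1), 16)
            continue
        m = RET_RE.search(line)
        if m and pending is not None:
            ok = m.group(1) == "0"
            delta = None if last_ok is None else pending - last_ok
            yield pending, delta, ok
            if ok:
                last_ok = pending
            pending = None


sample = """\
52506 python2.5 RET   read 354/0x162
52506 python2.5 CALL  break(0x8add000)
52506 python2.5 RET   break 0
52506 python2.5 CALL  break(0x8cc3000)
52506 python2.5 RET   break -1 errno 12 Cannot allocate memory
""".splitlines()

if __name__ == "__main__":
    for addr, delta, ok in break_requests(sample):
        print("%s delta=%s %s" % (hex(addr), delta, "ok" if ok else "FAILED"))
```

On the five lines quoted above, the failing call asks for the break to move 0x1e6000 bytes (about 1.9 MB) past the previous successful one. Run over the whole mailman_process_log.txt, it would show how the break address moved before the failure.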

You are also getting a stack trace from python when it exits with the "out of memory" error. ktrace is just showing python printing the stuff - it may be that the error also ends up in a log file somewhere - don't know where mailman logs, sorry. From that stack trace it should be possible to figure out which line of the python is actually causing that memory request. My bet is on one of the cPickle lines, but it would be nice to see the stack trace "raw" so to speak. Maybe that stack trace would help someone on the mailman list suggest something else.


Did you already try sending a different kind of attachment of about the same size (a bit bigger would be better)? Maybe it's something about the attachment itself that's causing the issue.
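
One way to get a "different kind" of attachment of a known size is to generate one. The sketch below builds a message with a random binary attachment using the standard email package; the addresses, the filename and the function name build_test_message are all placeholders, and actually handing the message to sendmail is left to whatever you normally use.

```python
import os
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_test_message(sender, recipient, payload_bytes, filename="test.bin"):
    """Build a multipart message carrying payload_bytes of random data."""
    msg = MIMEMultipart()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "attachment size test (%d bytes)" % payload_bytes
    msg.attach(MIMEText("Test message with a random binary attachment.\n"))
    # MIMEApplication base64-encodes the payload by default, so the message
    # on the wire is roughly a third larger than payload_bytes.
    part = MIMEApplication(os.urandom(payload_bytes))
    part.add_header("Content-Disposition", "attachment", filename=filename)
    msg.attach(part)
    return msg


if __name__ == "__main__":
    # Placeholder addresses; 600 KB is "a bit bigger" than the 500 KB case.
    msg = build_test_message("sender@example.org", "list@example.org", 600 * 1024)
    print("encoded size: %d bytes" % len(msg.as_string()))
```

If a random blob of the same size goes through cleanly, that points at the content of the original attachment rather than its size.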


As a final resort, if none of the above resolves or leads to clues, I would try uninstalling python2.5 and installing python2.4 *just in case*. I'm assuming that you only have python for mailman. (If you have real python users then it's trickier. You can install multiple versions of python but possibly not from ports. But python always compiled cleanly from tarball on FreeBSD for me. I can offer some help with that process if you really need it).


I can't help thinking that 500Kb is a very small attachment and I can't really see why it would legitimately cause a request for so much memory that your settings aren't handling it.


A quick look at the mailman web site shows that you can run qrunner from the command line - couldn't immediately find the man page though. If you could somehow queue up the email with Mailman switched off, you could run qrunner by hand and then you'd definitely get the python backtrace. Maybe the mailman list, or a mailman admin here, can help with that, if you need it.


Running out of ideas.

--Alex

_______________________________________________
[email protected] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
