Javier Fernández-Sanguino Peña wrote:

Hm...

> > As I suspected, the bug occurred again after I updated my system with
> > aptitude.
> Strange..

I am no longer so sure about this connection. I just restarted cron
by accident: while trying to attach a debugger to it (I couldn't think
of a better way to catch the child than breaking on "getpid" in gdb),
I inadvertently killed and restarted cron. After that, cron jobs worked
again, so to reproduce the bug I ran an update with aptitude. Cron is
still working now (unfortunately).

Other candidates for successfully reproducing the bug are:

 - Waiting a few days (sounds funny, I know)

 - Crashes of the machine (we had a power failure in our flat
   yesterday, and I have had some problems with broken memory lately)

> - monitor when cron launches its children (pid A)
> - send kill -SIGSTOP to pid A as soon as you see it
> - run 'gdb /usr/sbin/cron A'
> - send 'kill -SIGCONT A' and 'run' in gdb.
> - when you get the sigsegv in the cron job run 'bt'

The child segfaults very quickly. Can you think of a way to stop the
child automatically?

I tried

strace -f -p `pidof cron` -o '|sh -c "grep getpid|awk \"{print $5}\"|while read pid; do echo PID: $pid; kill -STOP $pid; done"'

But there's a bug in that line that I can't spot.
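
One guess at the problem: strace hands the command after "|" to a
shell, and because the inner script sits in double quotes, that shell
expands $5 and $pid (to nothing) before the inner sh ever runs. grep
and awk also buffer their output when writing to a pipe, so PIDs may
arrive too late. A minimal, untested sketch that sidesteps both by
moving the filter into a small script. The function names are made up
for illustration, and it assumes that strace -f with -o starts each
line with the PID of the traced process:

```shell
# Print the PID that made a getpid call, given one line of strace output.
# Assumption: the line starts with the PID, e.g. "1234 getpid() = 1234".
getpid_caller() {
    case $1 in
        *'getpid('*)
            pid=${1%%[!0-9]*}   # keep only the leading run of digits
            if [ -n "$pid" ]; then
                printf '%s\n' "$pid"
            fi
            ;;
    esac
}

# Read trace lines on stdin and stop every process seen calling getpid.
# Reading line by line in the shell avoids the buffering of grep and awk.
stop_getpid_callers() {
    while IFS= read -r line; do
        pid=$(getpid_caller "$line")
        [ -n "$pid" ] || continue
        echo "PID: $pid"
        kill -STOP "$pid"
    done
}
```

Saved as an executable script that ends with a call to
stop_getpid_callers, this could be passed to strace as
-o '|/path/to/script' (the path is a placeholder). Since a process can
only have one tracer, strace would have to detach before gdb can
attach to the stopped child.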

Would it work to simply stop cron and attach gdb to it and then
"cont"?

Well, I'll see when this bug reappears.


-- 
        Friedrich Delgado Friedrichs <[EMAIL PROTECTED]>
