--On 20. November 2007 15:59:18 +0100 Sebastian Hagedorn
<[EMAIL PROTECTED]> wrote:
I can fix this myself, but it's probably easier if you do it.
Just FYI: I fixed it locally with a 3-minute timeout and it compiled fine.
I'll start testing it now.
--
.:.Sebastian Hagedorn - RZKR-R1 (Geb
--On 20. November 2007 09:20:42 -0500 Ken Murchison <[EMAIL PROTECTED]>
wrote:
OK. Can you both try this alternate patch? It should be portable, and
GDB shouldn't cause it to kick out. I've set it up so that for
SSL-wrapped services it will time out after 3 minutes; otherwise it uses
the serv
Gary Mills wrote:
On Mon, Nov 19, 2007 at 12:35:46PM -0500, Ken Murchison wrote:
Sebastian Hagedorn wrote:
-- Ken Murchison <[EMAIL PROTECTED]> is rumored to have mumbled on
17. November 2007 11:21:38 -0500 regarding Re: One more attempt: stuck
processes:
Here's a patch that s
On Mon, Nov 19, 2007 at 12:35:46PM -0500, Ken Murchison wrote:
> Sebastian Hagedorn wrote:
> >-- Ken Murchison <[EMAIL PROTECTED]> is rumored to have mumbled on
> >17. November 2007 11:21:38 -0500 regarding Re: One more attempt: stuck
> >processes:
> >
> >
Sebastian Hagedorn wrote:
> -- Ken Murchison <[EMAIL PROTECTED]> is rumored to have mumbled on
> 19. November 2007 12:35:46 -0500 regarding Re: One more attempt: stuck
> processes:
>
>> How are things looking today?
>
> Good! When I just checked I thought
Sebastian Hagedorn wrote:
> -- Ken Murchison <[EMAIL PROTECTED]> is rumored to have mumbled on
> 19. November 2007 13:17:07 -0500 regarding Re: One more attempt: stuck
> processes:
>
>> The only other potential downside
>> the patch has is that stracing or
-- Ken Murchison <[EMAIL PROTECTED]> is rumored to have mumbled on 19.
November 2007 13:17:07 -0500 regarding Re: One more attempt: stuck
processes:
The only other potential downside
the patch has is that stracing or gdb'ing it causes the timeout to
trigger prematurely. AFAIK that
-- Ken Murchison <[EMAIL PROTECTED]> is rumored to have mumbled on 19.
November 2007 12:35:46 -0500 regarding Re: One more attempt: stuck
processes:
How are things looking today?
Good! When I just checked I thought I'd found a new hanging pop3d process,
because it's been ar
Sebastian Hagedorn wrote:
> -- Ken Murchison <[EMAIL PROTECTED]> is rumored to have mumbled on
> 17. November 2007 11:21:38 -0500 regarding Re: One more attempt: stuck
> processes:
>
>> Here's a patch that seems to fix the problem. I did some basic testing
>>
-- Ken Murchison <[EMAIL PROTECTED]> is rumored to have mumbled on 17.
November 2007 11:21:38 -0500 regarding Re: One more attempt: stuck
processes:
Here's a patch that seems to fix the problem. I did some basic testing
(Linux only) to make sure that it doesn't break anythi
Sebastian Hagedorn wrote:
> -- Ken Murchison <[EMAIL PROTECTED]> is rumored to have mumbled on
> 16. November 2007 15:54:50 -0500 regarding Re: One more attempt: stuck
> processes:
>
>> That's exactly what Gary is seeing.
>
> Right. Apparently stripped b
On Fri, Nov 16, 2007 at 06:37:52PM +0100, Sebastian Hagedorn wrote:
> OK. Still the symptom seems to be different from what I'm seeing.
It may be. As I said, I have had no time so far to investigate it in depth; I
just wanted to say "me too" for the hung process problem.
> Could it be that you have a
On Nov 16, 2007 6:24 PM, Alain Spineux <[EMAIL PROTECTED]> wrote:
> Hi
>
> Can I summarize the problem as:
I'm wrong
>
> The server is blocked in a read, waiting for the client's next command
> (this is normal;
> 99% of the processes are in this state).
No, it is waiting in select, and the select has
-- Ken Murchison <[EMAIL PROTECTED]> is rumored to have mumbled on 16.
November 2007 15:54:50 -0500 regarding Re: One more attempt: stuck
processes:
That's exactly what Gary is seeing.
Right. Apparently stripped binaries aren't any good for straces.
Its blocki
On Fri, Nov 16, 2007 at 03:54:50PM -0500, Ken Murchison wrote:
>
> That's exactly what Gary is seeing. It's blocking in SSL_accept().
> Apparently the client connects to port 995, and then either sends
> nothing, or goes away and leaves the socket open.
>
> I've reproduced the former by telneti
--On Friday, November 16, 2007 3:54 PM -0500 Ken Murchison
<[EMAIL PROTECTED]> wrote:
> I've reproduced the former by telnetting to port 995 and doing nothing.
> I have been unable to reproduce the latter because as soon as I QUIT the
> telnet session or kill() the telnet process, pop3d exits grac
Sebastian Hagedorn wrote:
> -- Ken Murchison <[EMAIL PROTECTED]> is rumored to have mumbled on
> 16. November 2007 12:58:49 -0500 regarding Re: One more attempt: stuck
> processes:
>
>>> So should I add a call to ERR_get_error()?
>>
>>
>> Not yet.
-- Ken Murchison <[EMAIL PROTECTED]> is rumored to have mumbled on 16.
November 2007 12:58:49 -0500 regarding Re: One more attempt: stuck
processes:
So should I add a call to ERR_get_error()?
Not yet. I'm assuming that none of these processes has hung. We're
getting a
I know it has been asked before and may be redundant, but... You
answered that cyrus-sasl is using /dev/urandom and should not run out of
entropy. However, what about openssl itself? It also uses random
numbers. Perhaps, as a test, try renaming /dev/random and running
ln -s /dev/urandom /dev/random.
Ga
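To illustrate the distinction being discussed: on Linux kernels of this era, a read from /dev/random could block when the entropy pool ran low, while /dev/urandom does not block for that reason. The sketch below reads bytes from /dev/urandom; read_urandom() is a hypothetical helper for illustration and says nothing about how OpenSSL or cyrus-sasl actually seed themselves.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill buf with len bytes from /dev/urandom.
 * Returns 0 on success, -1 on failure.
 */
int read_urandom(unsigned char *buf, size_t len)
{
    size_t got = 0;
    int fd;

    assert(buf != NULL);

    fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0)
        return -1;

    while (got < len) {
        ssize_t n = read(fd, buf + got, len - got);
        if (n > 0)
            got += (size_t)n;       /* partial reads are allowed */
        else if (n < 0 && errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        else
            break;                  /* EOF or a real error */
    }

    close(fd);
    return got == len ? 0 : -1;
}
```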
--On 16. November 2007 11:27:52 -0500 Ken Murchison <[EMAIL PROTECTED]>
wrote:
Sebastian Hagedorn wrote:
The only reason I could imagine for the sequence of calls was signal
handling. But let's be methodical. There's only one spot where
SSL_accept() is called: in tls_start_servertls(). In pop
Sebastian Hagedorn wrote:
> Nov 16 18:00:26 lvr13 pop3s[3847]: SSL_read() returned 0
> Nov 16 18:00:34 lvr13 pop3s[3215]: SSL_read() returned 0
> Nov 16 18:00:34 lvr13 pop3s[3199]: SSL_read() returned 0
> Nov 16 18:00:39 lvr13 pop3s[3199]: SSL_read() returned 0
> Nov 16 18:00:43 lvr13 pop3s[3229]:
--On 16. November 2007 18:21:21 +0100 Gabor Gombas <[EMAIL PROTECTED]>
wrote:
On Fri, Nov 16, 2007 at 06:11:01PM +0100, Sebastian Hagedorn wrote:
Well, that just sounds like you're running out of entropy. That's a
different issue. Recompile your cyrus-sasl to use /dev/urandom instead
of /dev
Hi
Can I summarize the problem as:
The server is blocked in a read, waiting for the client's next command
(this is normal;
99% of the processes are in this state). But the autologout procedure is
not working!
This means that the SIGALRM that should wake the process never comes or is not
handled prope
OK, now I got this:
Nov 16 18:37:06 lvr13 pop3s[23089]: SSL_read() returned -1
But that process terminated normally.
--
.:.Sebastian Hagedorn - RZKR-R1 (Gebäude 52), Zimmer 18.:.
Zentrum für angewandte Informatik - Universitätsweiter Service RRZK
.:.Universität zu Köln / Cologne University -
On Fri, Nov 16, 2007 at 05:13:13PM +0100, Sebastian Hagedorn wrote:
> --On 16. November 2007 14:23:17 +0100 Simon Matter <[EMAIL PROTECTED]>
> wrote:
>
> >Did you ever see non SSL connections get stuck?
>
> No.
Most of mine are `pop3d -s', but I have seen a few without the `-s'.
When I did a st
--On 16. November 2007 12:39:28 -0500 Ken Murchison <[EMAIL PROTECTED]>
wrote:
Sorry, my patch wasn't complete. It wasn't logging the value that I
wanted.
OK:
Nov 16 18:48:17 lvr13 pop3s[1385]: SSL_read() returned 0:5
Nov 16 18:48:33 lvr13 pop3s[1375]: SSL_read() returned 0:5
Nov 16 18:48:5
Sebastian Hagedorn wrote:
> --On 16. November 2007 12:39:28 -0500 Ken Murchison
> <[EMAIL PROTECTED]> wrote:
>
>> Sorry, my patch wasn't complete. It wasn't logging the value that I
>> wanted.
>
> OK:
>
> Nov 16 18:48:17 lvr13 pop3s[1385]: SSL_read() returned 0:5
> Nov 16 18:48:33 lvr13 pop3s[
On Nov 16, 2007 6:11 PM, Sebastian Hagedorn <[EMAIL PROTECTED]> wrote:
> --On 16. November 2007 18:07:51 +0100 Gabor Gombas <[EMAIL PROTECTED]>
> wrote:
>
> >> Hm, we don't suffer any actual slowdown, it's just that the number of
> >> processes increases over time.
> >
> > It's not a slowdown - the
On Fri, Nov 16, 2007 at 06:11:01PM +0100, Sebastian Hagedorn wrote:
> Well, that just sounds like you're running out of entropy. That's a
> different issue. Recompile your cyrus-sasl to use /dev/urandom instead of
> /dev/random or disable apop in /etc/imapd.conf:
Debian uses /dev/urandom for a
--On 16. November 2007 18:07:51 +0100 Gabor Gombas <[EMAIL PROTECTED]>
wrote:
Hm, we don't suffer any actual slowdown, it's just that the number of
processes increases over time.
It's not a slowdown - the client connects, and hangs. It never even gets
to the authentication phase (at least it'
Sebastian Hagedorn wrote:
> --On 16. November 2007 09:37:42 -0600 Gary Mills <[EMAIL PROTECTED]>
> wrote:
>
>>> Could you get a stack trace? If you have gdb you just call it with "gdb
>>> -p 19175". Then you can do "bt" at the prompt. I forget how to do it
>>> with Sun's debugger.
>>
>> Easy:
>
On Fri, Nov 16, 2007 at 05:20:00PM +0100, Sebastian Hagedorn wrote:
> That's a 2.6 kernel, right?
Yes, 2.6.18-2-amd64.
> Hm, we don't suffer any actual slowdown, it's just that the number of
> processes increases over time.
It's not a slowdown - the client connects, and hangs. It never even ge
--On 16. November 2007 14:23:17 +0100 Simon Matter <[EMAIL PROTECTED]>
wrote:
Did you ever see non SSL connections get stuck?
No.
--
.:.Sebastian Hagedorn - RZKR-R1 (Gebäude 52), Zimmer 18.:.
Zentrum für angewandte Informatik - Universitätsweiter Service RRZK
.:.Universität zu Köln / Colo
--On 16. November 2007 09:37:42 -0600 Gary Mills <[EMAIL PROTECTED]>
wrote:
Could you get a stack trace? If you have gdb you just call it with "gdb
-p 19175". Then you can do "bt" at the prompt. I forget how to do it
with Sun's debugger.
Easy:
# pstack 19175
19175: pop3d -s
fef9f81
Sebastian Hagedorn wrote:
> The only reason I could imagine for the sequence of calls was signal
> handling. But let's be methodical. There's only one spot where
> SSL_accept() is called: in tls_start_servertls(). In pop3d.c that's only
> called in cmd_starttls(). That in turn is called either
--On 16. November 2007 13:54:24 +0100 Alain Spineux <[EMAIL PROTECTED]>
wrote:
On Nov 16, 2007 12:36 PM, Sebastian Hagedorn <[EMAIL PROTECTED]>
wrote:
I just had a discussion with a colleague regarding this. He made two
observations:
1. In the absence of the SO_KEEPALIVE option it is entirely
--On 16. November 2007 16:52:27 +0100 Gabor Gombas <[EMAIL PROTECTED]>
wrote:
On Fri, Nov 16, 2007 at 12:36:49PM +0100, Sebastian Hagedorn wrote:
He suggested that the trace is unreliable. Perhaps a bug in RHEL 3's
version of OpenSSL messes up the stack. That would also explain why
nobody el
On Fri, Nov 16, 2007 at 12:36:49PM +0100, Sebastian Hagedorn wrote:
> He suggested that the trace is unreliable. Perhaps a bug in RHEL 3's
> version of OpenSSL messes up the stack. That would also explain why nobody
> else seems to have this problem.
FYI I also know a system that has problems w
On Fri, Nov 16, 2007 at 03:20:57PM +0100, Sebastian Hagedorn wrote:
> --On 16. November 2007 08:00:07 -0600 Gary Mills <[EMAIL PROTECTED]>
> wrote:
>
> >This timeout doesn't work in some cases. We have lots of POP sessions
> >that never terminate.
>
> That's interesting to hear! Especially sinc
Sebastian Hagedorn wrote:
> I think I will try one more approach: I reverted cyrus.conf to not use
> "-U 1" anymore, so that processes should be reused. I will strace one of
> the pop3d processes in the hope that it gets stuck. That way I should be
> able to see where things go wrong. If the pr
--On 16. November 2007 08:00:07 -0600 Gary Mills <[EMAIL PROTECTED]>
wrote:
This timeout doesn't work in some cases. We have lots of POP sessions
that never terminate.
That's interesting to hear! Especially since you are using Solaris.
About 30 out of 40 are in that state now.
Here's an e
On Fri, Nov 16, 2007 at 01:54:24PM +0100, Alain Spineux wrote:
> On Nov 16, 2007 12:36 PM, Sebastian Hagedorn <[EMAIL PROTECTED]> wrote:
> > --On 16. November 2007 11:27:09 +0100 Sebastian Hagedorn
> > <[EMAIL PROTECTED]> wrote:
> >
> > 1. In the absence of the SO_KEEPALIVE option it is entirely po
On Nov 16, 2007 12:36 PM, Sebastian Hagedorn <[EMAIL PROTECTED]> wrote:
> --On 16. November 2007 11:27:09 +0100 Sebastian Hagedorn
> <[EMAIL PROTECTED]> wrote:
>
> >> 1) Since it only happens on dialup connections, could it be that the
> >> dialin router at the providers end sends TCP/RST when a cl
--On 16. November 2007 11:27:09 +0100 Sebastian Hagedorn
<[EMAIL PROTECTED]> wrote:
1) Since it only happens on dialup connections, could it be that the
dialin router at the providers end sends TCP/RST when a client hangs up
and those packets are filtered somewhere, maybe on your firewall?
OK
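The SO_KEEPALIVE option mentioned elsewhere in this thread is the usual way to let the kernel notice a peer that vanished without a FIN or RST reaching the server. A minimal sketch of enabling it follows; enable_keepalive() is a hypothetical helper, and whether Cyrus sets this option is not established here.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper: turn on TCP keepalive probes for a socket.
 * Returns 0 on success, -1 on error (errno set by setsockopt()).
 */
int enable_keepalive(int fd)
{
    int on = 1;

    assert(fd >= 0);

    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}
```

With the option set, an idle connection whose peer no longer answers the probes eventually gets an error on read instead of blocking forever. System defaults for the idle time before the first probe are typically on the order of hours, so this alone does not give a quick timeout.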
--On 15. November 2007 19:25:19 +0100 Simon Matter <[EMAIL PROTECTED]>
wrote:
It's blinking red, which normally means a broken link. I'm not sure how
The file 0 is a symbolic link that doesn't really point to a file;
that's why the shell shows it blinking. Everything okay here.
Thanks.
> --On 15. November 2007 18:14:05 +0100 Alain Spineux <[EMAIL PROTECTED]>
> wrote:
>
>>> # strace -p 25038
>>> Process 25038 attached - interrupt to quit
>>> read(0,
>>
>> Do you know what 0 is? If it were a socket it should time out, shouldn't it?
>
> It should, I guess, but it doesn't.
>
>># ls -l /
--On 15. November 2007 18:14:05 +0100 Alain Spineux <[EMAIL PROTECTED]>
wrote:
# strace -p 25038
Process 25038 attached - interrupt to quit
read(0,
Do you know what 0 is? If it were a socket it should time out, shouldn't it?
It should, I guess, but it doesn't.
# ls -l /proc/25038/fd
should
On Nov 15, 2007 4:54 PM, Sebastian Hagedorn <[EMAIL PROTECTED]> wrote:
> --On 15. November 2007 08:32:18 -0500 Ken Murchison <[EMAIL PROTECTED]>
> wrote:
>
> >>> Since it looks like things are hanging when a process is being used, I'd
> >>> like to see if the problem goes away if we don't reuse the
Sebastian Hagedorn wrote:
> --On 15. November 2007 08:32:18 -0500 Ken Murchison
> <[EMAIL PROTECTED]> wrote:
>
Since it looks like things are hanging when a process is being used, I'd
like to see if the problem goes away if we don't reuse the processes.
I'm just trying to do
--On 15. November 2007 11:00:39 -0500 Ken Murchison <[EMAIL PROTECTED]>
wrote:
(gdb) bt
#0 0x0079f41e in __read_nocancel () from /lib/tls/libc.so.6
#1 0x00d0b2f7 in BIO_new_socket () from /lib/libcrypto.so.4
#2 0x00d092b2 in BIO_read () from /lib/libcrypto.so.4
#3 0x005dae13 in ssl23_re
Sebastian Hagedorn wrote:
> --On 15. November 2007 08:21:48 -0500 Ken Murchison
> <[EMAIL PROTECTED]> wrote:
>
>>> No. Since this potentially affects all IMAP and POP processes I would
>>> have to do it for all entries. Do you recommend that I try that?
>>
>> Since it looks like things are hangin
--On 15. November 2007 08:32:18 -0500 Ken Murchison <[EMAIL PROTECTED]>
wrote:
Since it looks like things are hanging when a process is being used, I'd
like to see if the problem goes away if we don't reuse the processes.
I'm just trying to do a bsearch() on the problem.
OK. I've made the cha
--On 15. November 2007 08:21:48 -0500 Ken Murchison <[EMAIL PROTECTED]>
wrote:
No. Since this potentially affects all IMAP and POP processes I would
have to do it for all entries. Do you recommend that I try that?
Since it looks like things are hanging when a process is being used, I'd
like t
Sebastian Hagedorn wrote:
> No. Since this potentially affects all IMAP and POP processes I would
> have to do it for all entries. Do you recommend that I try that?
Since it looks like things are hanging when a process is being used, I'd
like to see if the problem goes away if we don't reuse th
--On 14. November 2007 16:39:44 -0500 Ken Murchison <[EMAIL PROTECTED]>
wrote:
It looks to me like we are timing out the client while the client is
IDLEing, but we get a signal from idled in the middle of shutdown(). Try
this patch.
--- imapd.c.~1.535.~2007-11-14 16:16:21.0 -0500
+
--On 15. November 2007 06:55:44 -0500 Ken Murchison <[EMAIL PROTECTED]>
wrote:
OK. What version of OpenSSL?
cyradm says:
Built w/OpenSSL 0.9.7a Feb 19 2003
Running w/OpenSSL 0.9.7a Feb 19 2003
rpm says:
openssl-0.9.7a-33.23
This is RHEL 3.
Are they imaps/pop3s p
Sebastian Hagedorn wrote:
> Thanks. I will try this patch as soon as I can, but it's clearly not the
> only issue, because the same thing happens with POP processes. Here's an
> example for one:
>
> (gdb) bt
> #0 0x0096441e in __read_nocancel () from /lib/tls/libc.so.6
> #1 0x00ac02f7 in BIO_
Sebastian Hagedorn wrote:
> Hi,
>
> I've brought up this topic before. We've been running cyrus-imapd very
> happily for several years. Yet there's one issue that none of the
> updates have resolved. The last time I reported it we were running
> 2.2.12. Now we're running 2.3.8, but the issue i
--On 14. November 2007 09:30:45 -0600 Gary Mills <[EMAIL PROTECTED]>
wrote:
On Wed, Nov 14, 2007 at 04:15:13PM +0100, Sebastian Hagedorn wrote:
I've brought up this topic before. We've been running cyrus-imapd very
happily for several years. Yet there's one issue that none of the
updates hav
On Wed, Nov 14, 2007 at 04:15:13PM +0100, Sebastian Hagedorn wrote:
>
> I've brought up this topic before. We've been running cyrus-imapd very
> happily for several years. Yet there's one issue that none of the updates
> have resolved. The last time I reported it we were running 2.2.12. Now
> w
Hi,
I've brought up this topic before. We've been running cyrus-imapd very
happily for several years. Yet there's one issue that none of the updates
have resolved. The last time I reported it we were running 2.2.12. Now
we're running 2.3.8, but the issue is the same: POP and IMAP processes
t