Edit report at http://bugs.php.net/bug.php?id=46685&edit=1
ID: 46685 Updated by: cataphr...@php.net Reported by: scott dot php at scottrix dot co dot uk Summary: SSL code should use select -Status: Assigned +Status: Duplicate Type: Bug Package: OpenSSL related Operating System: * PHP Version: 5.*, 6 -Assigned To: rasmus +Assigned To: Block user comment: N Private report: N New Comment: This was fixed in 5.3.6 (see bug #53592) Previous Comments: ------------------------------------------------------------------------ [2010-07-21 22:53:31] voltara at gmail dot com I have a patch against 5.3.2, but cannot post it because of the error message "Please do not SPAM our bug system." It's the same code, but in the form of a diff applies cleanly. ------------------------------------------------------------------------ [2010-07-21 22:19:21] askalski at gmail dot com I made a somewhat different patch. As far as I can tell, the busy-waiting only happens during the call to php_openssl_enable_crypto, where the socket is forced nonblocking to support the connect timeout. My patch exposes the WANT_READ/WANT_WRITE error code through the return value of handle_ssl_error(), and adds a select() to the SSL_connect loop, using a deadline-based timeout calculation. The diff is against 5.2.9, which is the version I need to fix. It seems to apply cleanly against 5.2.13, but I haven't tested it. (I would need to create a separate patch for 5.3, though.) diff --git a/ext/openssl/xp_ssl.c b/ext/openssl/xp_ssl.c index c31a505..e0a72b1 100644 --- a/ext/openssl/xp_ssl.c +++ b/ext/openssl/xp_ssl.c @@ -92,19 +92,25 @@ static int handle_ssl_error(php_stream *stream, int nr_bytes, zend_bool is_init char *ebuf = NULL, *wptr = NULL; size_t ebuf_size = 0; unsigned long code, ecode; - int retry = 1; + int retry = 0; switch(err) { case SSL_ERROR_ZERO_RETURN: /* SSL terminated (but socket may still be active) */ - retry = 0; break; case SSL_ERROR_WANT_READ: case SSL_ERROR_WANT_WRITE: /* re-negotiation, or perhaps the SSL layer needs more * packets: retry in next iteration */ errno = EAGAIN; - retry = is_init ? 1 : sslsock->s.is_blocked; + if (is_init || sslsock->s.is_blocked) { + /* Return the actual error code, rather than just "1". The SSL_connect + * call is non-blocking for connect_timeout support ("blocking with timeout" + * from the outside perspective.) The caller needs to know whether to + * 'select' on read or write. + */ + retry = err; + } break; case SSL_ERROR_SYSCALL: if (ERR_peek_error() == 0) { @@ -115,7 +121,6 @@ static int handle_ssl_error(php_stream *stream, int nr_bytes, zend_bool is_init } SSL_set_shutdown(sslsock->ssl_handle, SSL_SENT_SHUTDOWN|SSL_RECEIVED_SHUTDOWN); stream->eof = 1; - retry = 0; } else { char *estr = php_socket_strerror(php_socket_errno(), NULL, 0); @@ -123,7 +128,6 @@ static int handle_ssl_error(php_stream *stream, int nr_bytes, zend_bool is_init "SSL: %s", estr); efree(estr); - retry = 0; } break; } @@ -137,7 +141,6 @@ static int handle_ssl_error(php_stream *stream, int nr_bytes, zend_bool is_init switch (ERR_GET_REASON(ecode)) { case SSL_R_NO_SHARED_CIPHER: php_error_docref(NULL TSRMLS_CC, E_WARNING, "SSL_R_NO_SHARED_CIPHER: no suitable shared cipher could be used. This could be because the server is missing an SSL certificate (local_cert context option)"); - retry = 0; break; default: @@ -175,7 +178,6 @@ static int handle_ssl_error(php_stream *stream, int nr_bytes, zend_bool is_init } } - retry = 0; errno = 0; } return retry; @@ -391,12 +393,9 @@ static inline int php_openssl_enable_crypto(php_stream *stream, php_stream_xport_crypto_param *cparam TSRMLS_DC) { - int n, retry = 1; + int n, retry; if (cparam->inputs.activate && !sslsock->ssl_active) { - float timeout = sslsock->connect_timeout.tv_sec + sslsock->connect_timeout.tv_usec / 1000000; - int blocked = sslsock->s.is_blocked; - if (!sslsock->state_set) { if (sslsock->is_client) { SSL_set_connect_state(sslsock->ssl_handle); @@ -406,36 +405,85 @@ static inline int php_openssl_enable_crypto(php_stream *stream, sslsock->state_set = 1; } - if (sslsock->is_client && SUCCESS == php_set_sock_blocking(sslsock->s.socket, 0 TSRMLS_CC)) { - sslsock->s.is_blocked = 0; - } - do { - if (sslsock->is_client) { - struct timeval tvs, tve; - struct timezone tz; + if (sslsock->is_client) { + int blocked = sslsock->s.is_blocked; + struct timeval deadline, timeout; + + /* temporarily force non-blocking, to support connection timeout */ + if (sslsock->s.is_blocked && SUCCESS == php_set_sock_blocking(sslsock->s.socket, 0 TSRMLS_CC)) { + sslsock->s.is_blocked = 0; + } + + /* set a deadline for completing negotiation, based on connect_timeout */ + gettimeofday(&deadline, NULL); + deadline.tv_sec += sslsock->connect_timeout.tv_sec; + deadline.tv_usec += sslsock->connect_timeout.tv_usec; + if (deadline.tv_usec > 1000000) { + deadline.tv_usec -= 1000000; + deadline.tv_sec++; + } + + while ((n = SSL_connect(sslsock->ssl_handle)) <= 0 && + (retry = handle_ssl_error(stream, n, 1 TSRMLS_CC))) + { + fd_set fds; + + FD_ZERO(&fds); + FD_SET(sslsock->s.socket, &fds); + + /* calculate select timeout based on deadline vs current time */ + gettimeofday(&timeout, NULL); + timeout.tv_sec = deadline.tv_sec - timeout.tv_sec; + timeout.tv_usec = deadline.tv_usec - timeout.tv_usec; + if (timeout.tv_usec < 0) { + timeout.tv_usec += 1000000; + timeout.tv_sec--; + } + + /* if there's any time left, select() on read or write */ + if (timeout.tv_sec >= 0) { + if (retry == SSL_ERROR_WANT_READ) { + n = select(sslsock->s.socket + 1, &fds, NULL, NULL, &timeout); + } + else { + n = select(sslsock->s.socket + 1, NULL, &fds, NULL, &timeout); + } + } + else { + n = 0; + } + + /* handle errors */ + if (n <= 0) { + if (n == 0) { + errno = ETIMEDOUT; + php_error_docref(NULL TSRMLS_CC, E_WARNING, "SSL: connection timeout"); + } + else { + char *estr = php_socket_strerror(php_socket_errno(), NULL, 0); + php_error_docref(NULL TSRMLS_CC, E_WARNING, + "SSL: %s", estr); + efree(estr); + } - gettimeofday(&tvs, &tz); - n = SSL_connect(sslsock->ssl_handle); - gettimeofday(&tve, &tz); + if (sslsock->s.is_blocked != blocked && SUCCESS == php_set_sock_blocking(sslsock->s.socket, blocked TSRMLS_CC)) { + sslsock->s.is_blocked = blocked; + } - timeout -= (tve.tv_sec + (float) tve.tv_usec / 1000000) - (tvs.tv_sec + (float) tvs.tv_usec / 1000000); - if (timeout < 0) { - php_error_docref(NULL TSRMLS_CC, E_WARNING, "SSL: connection timeout"); return -1; } - } else { - n = SSL_accept(sslsock->ssl_handle); } - if (n <= 0) { - retry = handle_ssl_error(stream, n, 1 TSRMLS_CC); - } else { - break; + if (sslsock->s.is_blocked != blocked && SUCCESS == php_set_sock_blocking(sslsock->s.socket, blocked TSRMLS_CC)) { + sslsock->s.is_blocked = blocked; } - } while (retry); + } + else { + n = SSL_accept(sslsock->ssl_handle); - if (sslsock->is_client && sslsock->s.is_blocked != blocked && SUCCESS == php_set_sock_blocking(sslsock->s.socket, blocked TSRMLS_CC)) { - sslsock->s.is_blocked = blocked; + if (n <= 0) { + handle_ssl_error(stream, n, 1 TSRMLS_CC); + } } if (n == 1) { ------------------------------------------------------------------------ [2010-07-21 16:56:39] askalski at gmail dot com Any progress on this ticket? We've begun encountering this bug somewhat regularly, whenever one of our third-party SOAP services has problems. The symptom is our entire cluster of PHP servers redlines the CPU for the duration of the SOAP outage. Do you still need a test case reduction? I have reproduced the issue in the latest PHP 5.2 and 5.3 releases. In the meantime, I am going to look over the original submitter's proposed patch, and use that to fix our immediate problem. ------------------------------------------------------------------------ [2009-11-27 18:44:53] paj...@php.net assigned to Rasmus, looks like he is willing to figure out the cause and improve this patch. ------------------------------------------------------------------------ [2008-11-26 13:36:31] scott dot php at scottrix dot co dot uk Seems to me you don't understand, I could easily have found this with a code walk and would not have any test case. The way you are using the SSL libraries is (IMO and the openssl examples) wrong. Doesn't matter if it has any effects on running scripts, it is still wrong. I DO understand that you will want a test to show it has been fixed, but this isn't possible with all coding problems as I'm sure you know, the main thing from a change like this is that it doesn't break any of the existing tests and regression tests that you already have. (or anybody else code that is out there....) The only test I have that shows the problem is an strace on a running script. I don't have permission to release that script and you don't have the server pages and software that the script runs against, so would be useless anyway. I would try to create one, but I personally don't know PHP so can't. Given the problem I would imagine that any PHP script that is talking SSL and having to wait for data to come back would have the same problem (viewable by strace output). Anyway, if you NEED more to be able to apply a patch and test it, then I guess you can close this bug. The important thing for me is that I have tried to get someone upstream interested in writing better code, and that it is searchable so that others that experience the problem can find it and try my fix. ------------------------------------------------------------------------ The remainder of the comments for this report are too long. To view the rest of the comments, please view the bug report online at http://bugs.php.net/bug.php?id=46685 -- Edit this bug report at http://bugs.php.net/bug.php?id=46685&edit=1