Hello! On Tue, Aug 18, 2020 at 05:35:16AM -0400, vergil wrote:
> Maxim Dounin Wrote: > ------------------------------------------------------- > > Hello! > > > > On Fri, Aug 14, 2020 at 10:34:49AM -0400, vergil wrote: > > > > > Maxim Dounin Wrote: > > > ------------------------------------------------------- > > > > Hello! > > > > > > > > On Thu, Aug 13, 2020 at 12:11:54PM -0400, vergil wrote: > > > > > > > > > Maxim Dounin Wrote: > > > > > ------------------------------------------------------- > > > > > > Hello! > > > > > > > > > > > > On Thu, Aug 13, 2020 at 11:39:36AM -0400, vergil wrote: > > > > > > > > > > > > > This one was hard to catch. > > > > > > > > > > > > > > I've captured one error with 30 seconds delta before and > > after > > > > the > > > > > > event. > > > > > > > Where can i attach log file for you? There's 400K messages, > > so i > > > > > > cannot > > > > > > > simple put it here. > > > > > > > > > > > > Attaching the log to the message into the mailing list should > > > > > > work, but I'm not sure it's supported by the (obsolete) forum > > > > > > interface you are using. If not, you may put the log at a > > > > > > convinient place and provide a link here, or attach it to a > > > > > > ticket on trac.nginx.org, or email to me privetely. > > > > > > > > > > I've attached log file to our S3 public storage. You can > > download it > > > > through > > > > > this link: > > > > > > > > > > > https://drive-public-eu.s3.eu-central-1.amazonaws.com/nginx/nginx-debu > > > > g.csv > > > > > > > > > > A note: this is a CSV format from our logging system. I can try > > to > > > > extract > > > > > logs in original format if you need. > > > > > > > > Thanks, but this doesn't seem to contain anything related to the > > > > SSL_shutdown() except the message itself: > > > > > > > > "2020-08-13T15:19:03.279Z","7","shmtx lock", > > > > "2020-08-13T15:19:03.279Z","7","shmtx lock", > > > > "2020-08-13T15:19:03.279Z","7","timer delta: 0", > > > > "2020-08-13T15:19:03.280Z","2","SSL_shutdown() failed (SSL: > > > > error:1409F07F:SSL routines:ssl3_write_pending:bad write retry) > > while > > > > closing request, client: XXX.XXX.XXX.XXX, server: > > > > XXX.XXX.XXX.XXX:443","9140" > > > > "2020-08-13T15:19:03.280Z","7","epoll: fd:322 ev:0005 > > > > d:00007F0A0FCDDEB0", > > > > "2020-08-13T15:19:03.280Z","7","epoll: fd:54 ev:0004 > > > > d:00007F0A0FCDFAC9", > > > > > > > > And nothing else in the log saying "SSL_shutdow()", while there > > > > should be a debug messages like "SSL_shutdown: -1" and > > > > "SSL_get_error: ..." right before the message, and nothing at all > > > > related to the connection 9140. > > > > > > > > It looks like the debug logging is only enabled on the global > > > > level, but disabled at http or server level. Please see the part > > > > starting at "Note that redefining the log without also specifying > > > > the debug level will disable the debugging log" in the "A > > > > debugging log" article > > > > (http://nginx.org/en/docs/debugging_log.html). > > > > > > So... Bad news: i cannot capture the event when full debug enabled. > > Server > > > cannot handle the load and our service partially down at that time. > > > > > > What can i say is that this problem reveal itself on all servers > > with new > > > nginx version. > > > > > > I'll send you the link privately where you can get our > > (some-redacted) > > > config files. > > > > Thank you for your efforts. Just in case, it is possible to > > configure debug logging only for parts of the load - using the > > debug_connection directive with large networks in CIDR notation > > (http://nginx.org/r/debug_connection). It's probably not needed > > in this particular case, see below. > > > > I was able to reporoduce an "SSL_shutdown() failed (SSL: ... bad > > write retry)" error at least in one case, similar to the one > > previously observed with SSL_write() in > > https://trac.nginx.org/nginx/ticket/1194. Previously, this case > > wasn't causing SSL_shutdown() errors, but SSL shutdown fix > > introduced in nginx 1.19.2 revealed the problem. > > > > The following patch should fix this. It was discussed previously > > as a possible fix for other SSL_shutdown() errors fixed in 1.19.2, > > but wasn't commited as there were concerns it will effectively > > disable SSL shutdown in some unrelated cases where c->error flag > > is misused. Now it is more or less clear that the change is > > needed. > > > > Patch (it would be great if you'll be able to test if it fixes the > > problem for you): > > > > # HG changeset patch > > # User Maxim Dounin <mdou...@mdounin.ru> > > # Date 1597531639 -10800 > > # Sun Aug 16 01:47:19 2020 +0300 > > # Node ID be7a3155e00161baf7359ffa73a3a226f1e487c9 > > # Parent 7d46c9f56c9afe34a38bb3aea99550a2fd884280 > > SSL: disabled shutdown after connection errors. > > > > This fixes ""SSL_shutdown() failed (SSL: ... bad write retry)" errors > > as observed on the second SSL_shutdown() call after SSL shutdown fixes > > in > > 09fb2135a589 (1.19.2), notably when sending fails in > > ngx_http_test_expect(), > > similarly to ticket 1194. > > > > Note that there are some places where c->error is misused to prevent > > further output, such as ngx_http_v2_finalize_connection() if there > > are pending streams, or in filter finalization. These places seem > > to be extreme enough to don't care about missing shutdown though. > > For example, filter finalization currently prevents keepalive from > > being used. > > > > diff --git a/src/event/ngx_event_openssl.c > > b/src/event/ngx_event_openssl.c > > --- a/src/event/ngx_event_openssl.c > > +++ b/src/event/ngx_event_openssl.c > > @@ -2793,7 +2793,7 @@ ngx_ssl_shutdown(ngx_connection_t *c) > > return NGX_OK; > > } > > > > - if (c->timedout) { > > + if (c->timedout || c->error) { > > mode = SSL_RECEIVED_SHUTDOWN|SSL_SENT_SHUTDOWN; > > SSL_set_quiet_shutdown(c->ssl->connection, 1); > > > > > > Good day. > > Patch mentioned above solved half the problems. > > SSL_shutdown() failed (SSL: error:1409F07F:SSL > routines:ssl3_write_pending:bad write retry) while processing HTTP/2 > connection > > Still remains in the logs. Do you see any other errors on the same connection before the SSL_shutdown() error? As suggested previously, somethig relevant might be logged at the "info" level. Note that seeing info-level error messages will probably require error logging to be reconfigured, much like with debug. If there is nothing, I'm afraid the only solution would be to try to catch a debugging log related to these errors. As previously suggested, this can be done without too much load by using the debug_connection with relatively large CIDR blocks and waiting for the error to happen from with a client from one of these blocks. -- Maxim Dounin http://mdounin.ru/ _______________________________________________ nginx mailing list nginx@nginx.org http://mailman.nginx.org/mailman/listinfo/nginx