Rainer, On 7/18/16 5:48 PM, Rainer Jung wrote: > Hi Chris, > > thanks for picking this topic. > > Am 18.07.2016 um 17:02 schrieb Christopher Schultz: >> All, >> >> Michael Deiner found a buffer overflow in the call to FD_SET macro on >> line 291 of jk_connect.c: >> >> 280> do { >> 281> rc = connect(sd, (const struct sockaddr *)&addr->sa.sin, >> addr->salen); >> 282> } while (rc == -1 && errno == EINTR); >> 283> >> 284> if ((rc == -1) && (errno == EINPROGRESS || errno == EALREADY) >> 285> && (timeout > 0)) { >> 286> fd_set wfdset; >> 287> struct timeval tv; >> 288> socklen_t rclen = (socklen_t)sizeof(rc); >> 289> >> 290> FD_ZERO(&wfdset); >> *291> FD_SET(sd, &wfdset);* >> 292> tv.tv_sec = timeout / 1000; >> 293> tv.tv_usec = (timeout % 1000) * 1000; >> 294> rc = select(sd + 1, NULL, &wfdset, NULL, &tv); >> >> I'd like to fix this so it won't bring-down the server :) >> >> But it quickly gets complicated. >> >> The method itself takes "sd" (a jk_sock_t) as an argument to the >> function, and we can check immediately whether it will cause FD_SET to >> overflow -- easy: just check to see if the value is too large -- but >> what should we do in that case? >> >> This function should be connecting to a back-end Tomcat server, but if >> we have too many outbound connections, we'll fail. >> >> I'm not sure it makes any sense to let things get this far. >> >> The proposed solution[1] is to use poll() instead of select(), but that >> won't work on every platform, plus I'd like to be able to fix the buffer >> overflow right away while we work-out a solution for poll() that will >> work in most environments. >> >> I think if the connection_pool_size exceeds FD_SETSIZE we should refuse >> to start. Any other behavior will eventually cause errors. > > +1 in principal. Unfortunately on Windows it seems the default for > FD_SETSIZE is only 64. That's probably too small but it seems it is > allowed on Windows to increase this limit during compilation: > > <Quote> > The variable FD_SETSIZE determines the maximum number of descriptors in > a set. (The default value of FD_SETSIZE is 64, which can be modified by > defining FD_SETSIZE to another value before including Winsock2.h.) > Internally, socket handles in an fd_set structure are not represented as > bit flags as in Berkeley Unix. Their data representation is opaque. > </Quote>
That's ... weird. Okay. > So we should IMHO aim for > > a) check connection pool size against FD_SETSIZE and fail during startup > if too big - or we decrease it to the max value and log a warning? On *NIX, that value cannot reasonably be changed. I think we need to make all our decisions at compile-time and fail-fast at runtime. Lowering to a reasonable maximum value is probably okay. I'm not sure which would be worse: requiring the administrator to fix a configuration problem before the server can even start (imagine a server that's been working for years without this config, now it requires some change) or auto-reconfiguring based upon a value the admin hasn't set. Actually... in cases where this would have affected users, the result would have been that everything is fine until there is a buffer overflow. Hopefully, the buffer overflow is fatal, but it might not be. So, lowering to a smaller value if connection_pool_size is too big sounds good to me. Log with severity=WARN is a good option for notification. > b) define 1024 as the compile time FD_SETSIZE on Windows (same value as > the default e.g. on Linux and on 32 Bit Solaris). We already use 250 as > the default connection pool size. +1 > c) allow to increase FD_SETSIZE when building on Windows because it is > supported there. +1 We probably want something like JK_FD_SETSIZE defaults to 1024 and then FD_SETSIZE = JK_FD_SETSIZE in the build. I have absolutely no idea how on earth to do that for our Windows builds. > d) use the existing macro HAVE_POLL to offer a poll based code path if > poll was detected. I don't think HAVE_POLL is any kind of standard. I poked-around my Debian Linux environment and HAVE_POLL was defined in a number of header files, but it was unconditionally-defined to be "1" in files such as postgresql/pg_config.h, so I think the package-maintainers must have just said "this system has poll.h, let's just set HAVE_POLL=1 and call it a day". > Concerning a) the code that reads the pool size is in > common/jk_ajp_common.c in function ajp_init(): > > p->ep_cache_sz = jk_get_worker_cache_size(props, p->name, cache); > > The function gets a logger as an argument, so if we want we can easily > log stuff there. Correcting a cacxhe size that's too big to the max > value and log would be easiest. Terminating the startup is more > difficult. We do it e.g. for Apache using jk_error_exit() but it will be > a bit tedious to propagate the error from jk_ajp_common.c to mod_jk.c. > Furthermore you need a similar solution for ISAPI. Therefore I suggest > to choose the "correct the config and warn" attempt. > >> Thoughts? I like correct-and-warn for a number of reasons. So forgetting poll() for the time being, the correct-and-warn would just be a check for connection_pool_size > FD_SETSIZE, then set connection_pool_size = FD_SETSIZE and warn, right? Seems simple enough :) Thanks, -chris
signature.asc
Description: OpenPGP digital signature