Rainer,

On 7/18/16 5:48 PM, Rainer Jung wrote:
> Hi Chris,
> 
> thanks for picking this topic.
> 
> Am 18.07.2016 um 17:02 schrieb Christopher Schultz:
>> All,
>>
>> Michael Deiner found a buffer overflow in the call to FD_SET macro on
>> line 291 of jk_connect.c:
>>
>> 280>   do {
>> 281>        rc = connect(sd, (const struct sockaddr *)&addr->sa.sin,
>> addr->salen);
>> 282>    } while (rc == -1 && errno == EINTR);
>> 283>
>> 284>    if ((rc == -1) && (errno == EINPROGRESS || errno == EALREADY)
>> 285>                   && (timeout > 0)) {
>> 286>        fd_set wfdset;
>> 287>        struct timeval tv;
>> 288>        socklen_t rclen = (socklen_t)sizeof(rc);
>> 289>
>> 290>        FD_ZERO(&wfdset);
>> *291>        FD_SET(sd, &wfdset);*
>> 292>        tv.tv_sec = timeout / 1000;
>> 293>        tv.tv_usec = (timeout % 1000) * 1000;
>> 294>        rc = select(sd + 1, NULL, &wfdset, NULL, &tv);
>>
>> I'd like to fix this so it won't bring-down the server :)
>>
>> But it quickly gets complicated.
>>
>> The method itself takes "sd" (a jk_sock_t) as an argument to the
>> function, and we can check immediately whether it will cause FD_SET to
>> overflow -- easy: just check to see if the value is too large -- but
>> what should we do in that case?
>>
>> This function should be connecting to a back-end Tomcat server, but if
>> we have too many outbound connections, we'll fail.
>>
>> I'm not sure it makes any sense to let things get this far.
>>
>> The proposed solution[1] is to use poll() instead of select(), but that
>> won't work on every platform, plus I'd like to be able to fix the buffer
>> overflow right away while we work-out a solution for poll() that will
>> work in most environments.
>>
>> I think if the connection_pool_size exceeds FD_SETSIZE we should refuse
>> to start. Any other behavior will eventually cause errors.
> 
> +1 in principal. Unfortunately on Windows it seems the default for
> FD_SETSIZE is only 64. That's probably too small but it seems it is
> allowed on Windows to increase this limit during compilation:
> 
> <Quote>
> The variable FD_SETSIZE determines the maximum number of descriptors in
> a set. (The default value of FD_SETSIZE is 64, which can be modified by
> defining FD_SETSIZE to another value before including Winsock2.h.)
> Internally, socket handles in an fd_set structure are not represented as
> bit flags as in Berkeley Unix. Their data representation is opaque.
> </Quote>

That's ... weird. Okay.

> So we should IMHO aim for
> 
> a) check connection pool size against FD_SETSIZE and fail during startup
> if too big - or we decrease it to the max value and log a warning?

On *NIX, that value cannot reasonably be changed. I think we need to
make all our decisions at compile-time and fail-fast at runtime.

Lowering to a reasonable maximum value is probably okay. I'm not sure
which would be worse: requiring the administrator to fix a configuration
problem before the server can even start (imagine a server that's been
working for years without this config, now it requires some change) or
auto-reconfiguring based upon a value the admin hasn't set.

Actually... in cases where this would have affected users, the result
would have been that everything is fine until there is a buffer
overflow. Hopefully, the buffer overflow is fatal, but it might not be.

So, lowering to a smaller value if connection_pool_size is too big
sounds good to me. Log with severity=WARN is a good option for notification.

> b) define 1024 as the compile time FD_SETSIZE on Windows (same value as
> the default e.g. on Linux and on 32 Bit Solaris). We already use 250 as
> the default connection pool size.

+1

> c) allow to increase FD_SETSIZE when building on Windows because it is
> supported there.

+1

We probably want something like JK_FD_SETSIZE defaults to 1024 and then
FD_SETSIZE = JK_FD_SETSIZE in the build. I have absolutely no idea how
on earth to do that for our Windows builds.

> d) use the existing macro HAVE_POLL to offer a poll based code path if
> poll was detected.

I don't think HAVE_POLL is any kind of standard. I poked-around my
Debian Linux environment and HAVE_POLL was defined in a number of header
files, but it was unconditionally-defined to be "1" in files such as
postgresql/pg_config.h, so I think the package-maintainers must have
just said "this system has poll.h, let's just set HAVE_POLL=1 and call
it a day".

> Concerning a) the code that reads the pool size is in
> common/jk_ajp_common.c in function ajp_init():
> 
> p->ep_cache_sz = jk_get_worker_cache_size(props, p->name, cache);
> 
> The function gets a logger as an argument, so if we want we can easily
> log stuff there. Correcting a cacxhe size that's too big to the max
> value and log would be easiest. Terminating the startup is more
> difficult. We do it e.g. for Apache using jk_error_exit() but it will be
> a bit tedious to propagate the error from jk_ajp_common.c to mod_jk.c.
> Furthermore you need a similar solution for ISAPI. Therefore I suggest
> to choose the "correct the config and warn" attempt.
> 
>> Thoughts?

I like correct-and-warn for a number of reasons.

So forgetting poll() for the time being, the correct-and-warn would just
be a check for connection_pool_size > FD_SETSIZE, then set
connection_pool_size = FD_SETSIZE and warn, right?

Seems simple enough :)

Thanks,
-chris

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to