On 07/27/2018 12:48 AM, Bernd Edlinger wrote:
I have one more example, similar to PR86259, that IMHO resembles real-world code.  Consider the following:

    int fun (char *p)
    {
      char buf[16];
      assert (strlen (p) < 4);              // here: security-relevant check
      sprintf (buf, "echo %s - %s", p, p);  // here: security-relevant code
      return system (buf);
    }

What is wrong with the assertion?  Nothing, except that it is removed when this function is called from untrusted code:

    void untrusted_fun (void)
    {
      char b[2] = "ab";
      fun (b);
    }

!!!! Don't try to execute that: after "ab" there can be "; rm -rf / ;" on your stack !!!!

Even the slightly safer check "assert (strnlen (p, 4) < 4);" would have been removed.

Now that is a simple error, and normally it would be easy to fix.  But when the assertion is removed, the security-relevant code is allowed to continue, where it creates more damage and is suddenly much harder to debug.
sprintf() is a known source of buffer overflows.  The recommended practice is to use snprintf() instead.  An alternate mechanism to constrain the number of bytes formatted by an individual %s directive is to use a precision, such as %.4s.
So, I start to believe that strlen range assumptions are unsafe unless we can prove that the string is in fact zero-terminated.  I would like to guard the strlen range checks with a new option, maybe -fassume-zero-terminated-char-arrays, and enable it only under -Ofast.  What do you think?
I'm not opposed to providing options to control various features, but I'm not in favor of disabling them by default as a solution to accommodate buggy code.  For every instance of undefined behavior in a program, whether it's reading or writing past the end of an object or subobject, or integer overflow, it's possible to show security-related consequences.  One could just as easily create a test case where allowing strlen to read past the end of a member array could be exploited to cause a subsequent buffer overflow.  Some of these consequences might be mitigated in some cases by one strategy and in other cases by another.  There's no silver bullet; the best approach is to drive improvements to the code to help weed out these bugs.

Even without _FORTIFY_SOURCE, GCC diagnoses (some) writes past the end of subobjects by string functions.  With _FORTIFY_SOURCE=2 it calls abort.  This is the default on popular distributions, including Fedora, RHEL, and Ubuntu.  -Wstringop-truncation tries to help detect the creation of unterminated strings by strncpy and strncat.  There is little reason in my mind to treat strlen or any other function as special, except perhaps for the few existing exceptions of the raw memory functions (memcpy, et al.).

As you know, I have already posted a patch to detect a subset of the problem of calling strlen on unterminated arrays.  More such issues, including uses of dynamically created and uninitialized arrays, can be detected by relatively modest enhancements to the tree-ssa-strlen pass (also on my list of things to do).  It may also be worth considering moving the "initializer-string for array of chars is too long" warning from -Wc++-compat to -Wall or -Wextra.

But I would much rather focus on these solutions and work toward overall improvements than weaken optimization to accommodate undefined code.  With sufficient awareness as a result of warnings, such code should all but disappear.
Following stricter rules opens up opportunities for deeper analyses to enable more optimization and to detect even more bugs.

Martin
