Hey Sarah,

Many array-bounds and format-string problems can already be found, especially
with LTO, CLooG, loop unrolling, and -O3 enabled. Seeing across object-file
boundaries, understanding loop bounds, and aggressive inlining allow GCC
to warn about a lot of real-world vulnerabilities. When multiple IPA passes
land in trunk, it should be even better.

What I think is missing is:

1) detection of double-free. There is already a function attribute called
'malloc', which is used to mark a specific kind of allocation function whose
return value will never be aliased. You could use that attribute, in addition
to a new one ('free'), to track potential double-frees of values via VRP/IPA.
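
As a rough sketch of the pattern such tracking would look for: 'malloc' below
is the existing GCC attribute, while the 'free' attribute is only my proposal,
so it appears as a comment. dup_name(), release_name() and process() are
made-up names for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Existing GCC attribute: the returned pointer aliases no other object. */
__attribute__((malloc))
char *dup_name(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Hypothetical: the proposed 'free' attribute would mark parameter 1 as
   released by this call. It is NOT an existing attribute in this form, so
   it is left as a comment. */
void release_name(char *p) /* __attribute__((free (1))) -- hypothetical */
{
    free(p);
}

int process(const char *s, int fail)
{
    char *copy = dup_name(s);
    if (copy == NULL)
        return -1;
    if (fail) {
        release_name(copy);
        copy = NULL;    /* drop this line and the call below frees twice */
    }
    release_name(copy); /* free(NULL) is a no-op */
    return fail ? 1 : 0;
}
```

With the `copy = NULL;` line removed, the value returned by dup_name() would
reach release_name() twice on the `fail` path, which is the case a checker
built on the two attributes should report.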

2) the ability to annotate functions with the taint and filtering side effects
they have on their parameters, much like the format() attribute. (I've been
asking the PC-Lint people for this feature for some time.) You could make this
even more generic and just add a new attribute that allows for tagging and
checking of arbitrary tags:
ssize_t recv(int sockfd, void *buf, size_t len, int flags)
    __attribute__ ((add_parameter_tag ("taint", 2)))
    __attribute__ ((add_return_value_tag ("taint")));

int count_sql_rows_for(const char* name)
    __attribute__ ((disallow_parameter_tag ("taint", 1)));
void filter_sql_characters_from(const char* name)
    __attribute__ ((removes_parameter_tag ("taint", 1)));

Then a program like this would compile without a warning only while the filter
call is in place:
int main(void) {
  char name[20] = {0};
  recv(GLOBAL_SOCKET, name, sizeof(name), 0);
  filter_sql_characters_from(name); // comment out this line to get a warning
  count_sql_rows_for(name);
}

When I wrote my binary static analysis product, BugScan, we assumed that if a
pointer was tainted, so were its contents. (This was especially necessary for
collections like lists and vectors in Java and C++ binaries.) You may want to
be more explicit about that, by having a recursively_add_parameter_tag() or
somesuch that only applies to pointer parameters.
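
To make the intended tag semantics concrete, here is a small run-time model of
them. This is only a sketch of the bookkeeping, not how GCC would implement
it: the real check would happen at compile time, and struct tagged_buf,
model_recv(), model_filter() and model_sink_ok() are all made-up names. The
tag lives on the buffer as a whole, matching the "tainted pointer means
tainted contents" assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { TAG_TAINT = 1u << 0 };

/* A buffer plus the set of tags currently attached to it. */
struct tagged_buf {
    char data[20];
    unsigned tags;
};

/* Models add_parameter_tag ("taint", ...): data from an untrusted source. */
static void model_recv(struct tagged_buf *b, const char *wire)
{
    strncpy(b->data, wire, sizeof b->data - 1);
    b->data[sizeof b->data - 1] = '\0';
    b->tags |= TAG_TAINT;
}

/* Models removes_parameter_tag ("taint", ...): strips quote and semicolon
   characters (an illustrative filter only), then clears the tag. */
static void model_filter(struct tagged_buf *b)
{
    size_t j = 0;
    for (size_t i = 0; b->data[i] != '\0'; i++)
        if (b->data[i] != '\'' && b->data[i] != ';')
            b->data[j++] = b->data[i];
    b->data[j] = '\0';
    b->tags &= ~(unsigned)TAG_TAINT;
}

/* Models disallow_parameter_tag ("taint", ...): false is where the
   compiler would emit its warning. */
static bool model_sink_ok(const struct tagged_buf *b)
{
    return (b->tags & TAG_TAINT) == 0;
}
```

Calling model_sink_ok() straight after model_recv() yields false; calling
model_filter() in between makes it true.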

3) detection of strings that lack explicit NUL-termination. This one gets
really complicated, especially for situations where they are terminated
properly and then become un-terminated.
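
A minimal example of the "terminated, then un-terminated" case, using only
standard strncpy() behaviour; is_terminated() and demo_lost_terminator() are
made-up helper names:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if a NUL byte occurs somewhere within the first 'size' bytes. */
static bool is_terminated(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Returns true if the buffer started out terminated and ended up not. */
static bool demo_lost_terminator(void)
{
    char buf[4] = "abc";                  /* 'a','b','c','\0' */
    bool before = is_terminated(buf, sizeof buf);

    /* The source is exactly as long as the buffer, so strncpy() fills all
       four bytes and writes no terminator. */
    strncpy(buf, "wxyz", sizeof buf);
    bool after = is_terminated(buf, sizeof buf);

    return before && !after;
}
```

An analysis would have to carry the "terminated" property through every write
to the buffer to notice that the second state is no longer safe to hand to
strlen() or printf("%s").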

4) a loop that writes through a pointer, and increments that pointer, while
being bounded by a tainted value. You'd have to add an extension to the loop
unroller for that, and just check for the 'taint' tag on the bounds check.
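
A sketch of the loop shape in question, with the bound clamped so that it is
safe. fill_clamped() is a made-up name, and 'n' stands in for a length that
came from untrusted input:

```c
#include <stddef.h>

/* Writes up to 'n' bytes through an incrementing pointer, never more than
   'dst_size'. Returns the number of bytes written. */
static size_t fill_clamped(char *dst, size_t dst_size, size_t n)
{
    /* Without this clamp the loop bound is the tainted value itself, which
       is the pattern the proposed check should flag. */
    if (n > dst_size)
        n = dst_size;

    char *p = dst;
    for (size_t i = 0; i < n; i++)
        *p++ = 'A';

    return (size_t)(p - dst);
}
```

The clamp plays the same role here as the filter function does for the SQL
example: it is the point at which the 'taint' tag on the bound could be
considered removed.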


Of course, you still run into temporal ordering issues, especially with 
globals, where the CFG ordering won't help.

But don't let that discourage you -- it would be great work to see done and 
commoditized, and would probably be better than most commercial analyzers as 
well ;)

Let me know if you need any more of my expertise in this area. I can't speak 
for GCC internals, though.

