On 4/29/21 11:18 AM, David Malcolm wrote:
I've been going through old Linux kernel CVEs, trying to prototype some possible new warnings for -fanalyzer in GCC 12 (and, alas, finding places where the analyzer internals need work...)I think I want a way for the user to be able to mark security boundaries in their code: for example: * in the Linux kernel the boundary between untrusted user-space data and kernel data, or, * for a user-space daemon, the boundary between data coming from the network and the data of daemon itself The analyzer could then make use of this, for example: (a) marking untrusted incoming data as "tainted" and prioritizing analysis of paths that make use of it (e.g. a "might overflow a buffer when N is really large" goes from being a noisy false positive when we simply have no knowledge of N (or the buffer's size) to being a serious issue if N is under the control of an attacker (b) copying uninitialized data back to the untrusted region becomes a potential disclosure of sensitive information I think I also want a way to mark system calls and ioctl implementations, so that I mark all of the parameters as being potentially hostile. Specifically, the Linux kernel uses functions like this: #define __user extern long copy_to_user(void __user *to, const void *from, unsigned long n); extern long copy_from_user(void *to, const void __user *from, long n); in various places, so I want a way to mark the "to" and "from" params as being a security boundary. I've been experimenting with implementing (b) for CVE-2011-1078 (in which a copy_to_user is passed a pointer to an on-stack buffer that isn't fully initialized, hence a disclosure of information to user- space). Martin: I believe you added __attribute__((access)) in GCC 9. I was thinking of extending it to allow something like: #define __user extern long copy_to_user(void __user *to, const void *from, unsigned long n) __attribute__((access (untrusted_write, 1, 3), access (read_only, 2, 3) )); extern long copy_from_user(void *to, const void __user *from, long n) __attribute__((access (write_only, 1, 3), access (untrusted_read, 2, 3) )); so that to "to" and "from" and marked as being writes and reads of up to size n, but they are flagged as "untrusted" as appropriate, so the analyzer can pay particular attention as described above. Does the above idea sound like a sane extension of the access attribute?
The trust aspect seems orthogonal to the access mode, so my initial thought is if it should be a separate attribute. That would also make it applicable to other sources of data such as function return values, or namespace scope variables. At the same time, I have been thinking of extending attribute access to these contexts as well: e.g., to denote that a function like getenv() returns a pointer to a non-modifiable string, or that the array pointed to by a variable like environ should not be directly modified by a program. So maybe it's not unreasonable to integrate the two.
I tried implementing it, but "access" seems to get converted to its own microformat for expressing these things as strings (created via append_access_attr, and parsed in e.g. init_attr_rdwr_indices), which seems to make it much harder than I was expecting.
The translation was done to address a concern about the cost of parsing of the new attribute with all its variations. I suspect Richard (who raised the concern) might have been envisioning a format similar to attribute fn spec. That's what I aimed for in any event. But attribute access is more general and cab be dynamically extended and so it's more complex, and so the result is probably not quite what he had in mind. The translation from the human readable form to the condensed internal form is tricky, a source of bugs, and a royal PITA to work with. I'd be happy to go back to the original design (but I'm not terribly motivated to do the work myself to change it). That said. I wouldn't expect adding a new mode to be too bad. Just add the handling to handle_access_attribute and the names and the corresponding letters to attr_access.
Any thoughts about how to mark system calls/ioctls? The simplest would be an attribute that marks all parameters as being untrusted, and the return value, somehow.
I'm not sure I understand the question but if you're wondering about what to do about data coming in and out of variadic functions with arbitrary orders of arguments of arbitrary types I don't really have any thoughts on how to deal with those in general. Martin
Thanks Dave
