On 4/29/21 11:18 AM, David Malcolm wrote:
I've been going through old Linux kernel CVEs, trying to prototype some
possible new warnings for -fanalyzer in GCC 12 (and, alas, finding
places where the analyzer internals need work...)

I think I want a way for the user to be able to mark security
boundaries in their code: for example:
* in the Linux kernel the boundary between untrusted user-space data
and kernel data, or,
* for a user-space daemon, the boundary between data coming from the
network and the data of daemon itself

The analyzer could then make use of this, for example:

(a) marking untrusted incoming data as "tainted" and prioritizing
analysis of paths that make use of it (e.g. a "might overflow a buffer
when N is really large" goes from being a noisy false positive when we
simply have no knowledge of N (or the buffer's size) to being a serious
issue if N is under the control of an attacker

(b) copying uninitialized data back to the untrusted region becomes a
potential disclosure of sensitive information

I think I also want a way to mark system calls and ioctl
implementations, so that I mark all of the parameters as being
potentially hostile.

Specifically, the Linux kernel uses functions like this:
   #define __user
   extern long copy_to_user(void __user *to, const void *from, unsigned long n);
   extern long copy_from_user(void *to, const void __user *from, long n);

in various places, so I want a way to mark the "to" and "from" params
as being a security boundary.

I've been experimenting with implementing (b) for CVE-2011-1078 (in
which a copy_to_user is passed a pointer to an on-stack buffer that
isn't fully initialized, hence a disclosure of information to user-
space).

Martin: I believe you added __attribute__((access)) in GCC 9.

I was thinking of extending it to allow something like:

#define __user
extern long copy_to_user(void __user *to, const void *from, unsigned long n)
   __attribute__((access (untrusted_write, 1, 3),
                 access (read_only, 2, 3)
                 ));

extern long copy_from_user(void *to, const void __user *from, long n)
   __attribute__((access (write_only, 1, 3),
                 access (untrusted_read, 2, 3)
                 ));

so that to "to" and "from" and marked as being writes and reads of up
to size n, but they are flagged as "untrusted" as appropriate, so the
analyzer can pay particular attention as described above.

Does the above idea sound like a sane extension of the access
attribute?

The trust aspect seems orthogonal to the access mode, so my initial
thought is if it should be a separate attribute.  That would also
make it applicable to other sources of data such as function return
values, or namespace scope variables.

At the same time, I have been thinking of extending attribute access
to these contexts as well: e.g., to denote that a function like
getenv() returns a pointer to a non-modifiable string, or that
the array pointed to by a variable like environ should not be
directly modified by a program.  So maybe it's not unreasonable
to integrate the two.


I tried implementing it, but "access" seems to get converted to its own
microformat for expressing these things as strings (created via
append_access_attr, and parsed in e.g. init_attr_rdwr_indices), which
seems to make it much harder than I was expecting.

The translation was done to address a concern about the cost of
parsing of the new attribute with all its variations.  I suspect
Richard (who raised the concern) might have been envisioning
a format similar to attribute fn spec.  That's what I aimed
for in any event.  But attribute access is more general and
cab be dynamically extended and so it's more complex, and
so the result is probably not quite what he had in mind.

The translation from the human readable form to the condensed
internal form is tricky, a source of bugs, and a royal PITA to
work with.  I'd be happy to go back to the original design (but
I'm not terribly motivated to do the work myself to change it).

That said. I wouldn't expect adding a new mode to be too bad.
Just add the handling to handle_access_attribute and the names
and the corresponding letters to attr_access.


Any thoughts about how to mark system calls/ioctls?  The simplest would
be an attribute that marks all parameters as being untrusted, and the
return value, somehow.

I'm not sure I understand the question but if you're wondering
about what to do about data coming in and out of variadic functions
with arbitrary orders of arguments of arbitrary types I don't really
have any thoughts on how to deal with those in general.

Martin


Thanks
Dave


Reply via email to