On 11/18/20 1:41 PM, David Malcolm wrote:
On Mon, 2020-11-16 at 17:49 -0700, Martin Sebor wrote:
On 11/13/20 2:44 PM, Jeff Law via Gcc-patches wrote:
On 10/5/20 5:12 PM, David Malcolm via Gcc-patches wrote:
This work-in-progress patch generalizes the malloc/free problem-
checking
in -fanalyzer so that it can work on arbitrary acquire/release
API pairs.

It adds a new __attribute__((deallocated_by(FOO))) that could be
used
like this in a library header:

    struct foo;

    extern void foo_release (struct foo *);

    extern struct foo *foo_acquire (void)
      __attribute__ ((deallocated_by(foo_release)));

In theory, the analyzer then "knows" these functions are an
acquire/release pair, and can emit diagnostics for leaks, double-
frees,
use-after-frees, mismatching deallocations, etc.

My hope was that this would provide a minimal level of markup
that would
support library-checking without requiring lots of further
markup.
I attempted to use this to detect a memory leak within a Linux
driver (CVE-2019-19078), by adding the attribute to mark these
fns:
    extern struct urb *usb_alloc_urb(int iso_packets, gfp_t
mem_flags);
    extern void usb_free_urb(struct urb *urb);
where there is a leak of a "urb" on an error-handling path.
Unfortunately I ran into the problem that there are various other
fns
that take "struct urb *" and the analyzer conservatively assumes
that a
urb passed to them might or might not be freed and thus stops
tracking
state for them.

So I don't know how much use this feature would be as-is.
(without either requiring lots of additional attributes for
marking
fndecl args as being merely borrowed, or simply assuming that
they
are borrowed in the absence of a function body to analyze)

Thoughts?
Dave

gcc/analyzer/ChangeLog:
        * region-model-impl-calls.cc
        (region_model::impl_deallocation_call): New.
        * region-model.cc: Include "attribs.h".
        (region_model::on_call_post): Handle fndecls referenced by
        __attribute__((deallocated_by(FOO))).
        * region-model.h (region_model::impl_deallocation_call): New
decl.
        * sm-malloc.cc: Include "stringpool.h" and "attribs.h".
        (enum wording): Add WORDING_DEALLOCATED.
        (malloc_state_machine::custom_api_map_t): New typedef.
        (malloc_state_machine::m_custom_apis): New field.
        (start_p): New.
        (use_after_free::describe_state_change): Handle
        WORDING_DEALLOCATED.
        (use_after_free::describe_final_event): Likewise.
        (malloc_leak::describe_state_change): Only emit "allocated
here" on
        a start->nonnull transition, rather than on other transitions
to
        nonnull.
        (malloc_state_machine::~malloc_state_machine): New.
        (malloc_state_machine::on_stmt): Handle
        "__attribute__((deallocated_by(FOO)))", and the special
attribute
        set on FOO.
        (malloc_state_machine::get_or_create_api): New.
        (malloc_state_machine::on_allocator_call): Add
"returns_nonnull"
        param and use it to affect which state to transition to.

gcc/c-family/ChangeLog:
        * c-attribs.c (c_common_attribute_table): Add entry for
        "deallocated_by".
        (matching_deallocator_type_p): New.
        (maybe_add_deallocator_attribute): New.
        (handle_deallocated_by_attribute): New.

gcc/ChangeLog:
        * doc/extend.texi (Common Function Attributes): Add
        "deallocated_by".

gcc/testsuite/ChangeLog:
        * gcc.dg/analyzer/attr-deallocated_by-1.c: New test.
        * gcc.dg/analyzer/attr-deallocated_by-1a.c: New test.
        * gcc.dg/analyzer/attr-deallocated_by-2.c: New test.
        * gcc.dg/analyzer/attr-deallocated_by-3.c: New test.
        * gcc.dg/analyzer/attr-deallocated_by-4.c: New test.
        * gcc.dg/analyzer/attr-deallocated_by-CVE-2019-19078-usb-
leak.c:
        New test.
        * gcc.dg/analyzer/attr-deallocated_by-misuses.c: New test.

I'd probably go with something more like acquire/release since I
think
the same concepts apply to things like file descriptors acquired by
open
and released by close.  I think the basic concept makes sense and
would
be useful, so I'd lean towards moving forward even if it hasn't
been
particularly useful for the analyzer yet.  One could even ponder
propagation of the attribute similar to what we do with const/pure
so
that we could see through wrappers without the user having to do
more
markup.


What I wonder here is whether or not Martin's work could take
advantage
of the attribute.   I don't see that as strictly necessary for the
patch
to move forward, just a question we should try to answer.

It could, but it would need at least one change.  The patch I posted
on Friday introduces a similar attribute:
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559053.html

The main differences between the two are that deallocated_by works
on integers in addition to pointers, but doesn't support positional
arguments for the deallocator.  The work I submitted has no support
for integers (meaning I couldn't support the attribute for those
APIs even if I had an attribute for that) but for completeness needs
the positional argument (it's also what the kernel developers asked
for).

I propose we introduce two attributes with different names: one
for pointers and another for integers (and perhaps other kinds of
handles in the future).  The analyzer would handle both, the rest
of GCC just the pointer variety (at least for now).

I don't think introducing two different attributes for the same
thing (i.e., for pointer APIs), one for the analyzer and another
for GCC warnings, would be helpful to users.

I spent some time today porting my patch to sit on top of Martin's,
eliminating the "deallocated_by" attribute in favor of the "malloc"
attribute.

It works (though I haven't yet implemented the optional param index,
but I think that won't be hard).

I like the idea of introducing a separate attribute for marking non-
pointer acquire/release pairs; I can rework my patch to do that.

Martin's work supports multiple deallocators, which mine doesn't (but
probably can with a little reworking).

Great!  I'm glad to hear you didn't hit any roadblocks.


I'm not in love with "__attribute__((malloc (FOO)))", in that I thought
"deallocated_by" was clearer - but it's not a dealbreaker for me.  It's
kind of hidden in the testcases by the way Martin used macros for the
attribute.

I don't insist on the name.  I reused malloc mainly because it's
already there, seemed like a good fit, and I recall Richi being
concerned about the costs of parsing too many attributes.

One of my testcases detected mismatching types, which the malloc
attribute handler doesn't:

/* Mismatching types.  */
struct foo {};
struct bar {};
extern void takes_foo (struct foo *); /* { dg-message "\\.\\.\\.whereas 'takes_foo' 
accepts type 'struct foo \\*'" } */
struct bar *wrong_deallocator_type (void)
   __attribute__ ((malloc(takes_foo))); /* { dg-error "'malloc' attribute applied to 
function returning type 'struct bar \\*'\\.\\.\\." } */

Are we allowing users to be flexible about this, or should we implement
type checking in the malloc attribute handler?
(see matching_deallocator_type_p in my patch for how I did it for
deallocated_by).

I briefly considered but decided not to implement any type checking.
I figure there could be allocators for multiple types of objects all
associated with the same deallocator.  This could be not just T* to
void*, but in C++ also Derived* to Base*.  The latter could be
accommodated as well with some effort, it just didn't seem worth
the additional complexity.

So hopefully that gives us a way forward.  I'm about to disappear for a
week and a half, so don't let my analyzer patches stand in the way of
Martin's.  I can finish reworking my stuff on top of Martin's when I
get back, or if they aren't in "master" yet I can combine parts of
Martin's patches and integrate them into mine.

Hope this sounds sane

It does to me.  Thanks again for taking the time to do the prototype
and confirm there are no roadblocks for sharing the same attribute!

Martin

Dave

Martin


So I don't mind seeing it go forward.  I leave it as your call.


jeff




Reply via email to