On 10/08/13 13:41, Marc Glisse wrote:
Glad to see you checked this and have a test for it.
I am slowly starting to understand how reviewers think ;-)
That's a huge part of the submission process. Once you know what folks
are looking for the path to approval gets appropriately short :-)
Not required for approval, but an "extra credit" -- a warning if a
NULL value flows into a return statement in a function with this marking.
Similarly, not required for approval, but it'd be real cool if we
could back-propagate the non-null return value attribute. ie, any
value flowing into the return statement of one of these functions can
be assumed to be non-zero, which may help eliminate more null pointer
checks in the decorated function. I guess ultimately we'd have to see
if noting this actually helps any real code.
Also not required for approval, but adding returns_nonnull markups to
appropriate functions in gcc itself.
I completely agree that returns_nonnull has more uses. The warning is
the first one (not sure where to add such a check though), and maybe we
should have a sanitizer option to check the nonnull and returns_nonnull
attributes, although I don't know if that should be in the caller, in
the callee, or both.
Well, it seems to me there's both compile-time and runtime (sanitizer)
possibilities. I'm much more familiar with the compile-time analysis
and can comment on that much more substantially.
In a function with the attribute, what we want to know is if any NULL
values reach the return statement. Obviously if CCP/VRP/DOM manage to
propagate a NULL into a return statement, then we issue a "returns NULL"
warning, much like we do for "is used uninitialized".
The more interesting case is if a value is defined by a PHI. If a NULL
is found in the PHI argument list, then we'd want to use a "may return
NULL warning" much like we do for "may be used uninitialized". Note
this is true anytime we have a NULL value in a PHI arg and the result of
the PHI can ultimately reach a return statement.
If set of all values flowing to the return are SSA_NAMEs and range
information indicates some might be NULL, then the question is do we
issue a warning or not. My inclination would be yes, but we'd want it
to be separate from the earlier two cases due to the high noise ratio.
Note there is significant overlap with a queued enhancement of mine to
warn about potential null pointer dereferences -- which needs to work in
precisely the same manner and has the same problems with signal to noise
ratios, particularly in the 3rd case. Given that, I'd be happy to pull
that patch out of my stash and let you play with it if you're
interested. It hasn't been hacked on in a couple years, so it'd need
some updating.
In terms of exploiting the non-nullness, lots of fun here too.
For example, if CCP/VRP/DOM propagate a NULL into a return statement, a
program executing that code should be declared as having undefined
behaviour. Assuming we do that, then we change the return 0 into a trap.
If there is a PHI argument with a NULL value that then feeds a return
statement, we should isolate that path by block duplication. Once the
path is isolated, it turns into the former case and we apply the same
transformation.
As it turns out I'm working right now on cleaning up a change to do
precisely that for cases where a NULL value feeds into a pointer
dereference :-)
Finally, backpropagation. Any SSA_NAME that directly or indirectly
reaches a return statement in one of these decorated functions has a
known non-zero value. It shouldn't be too hard to teach that to VRP
which will then use that information to do more aggressive NULL pointer
elimination.
From an optimization POV, I guess there is also the inlining that could
be improved so it doesn't lose the nonnull property.
The main request I am aware of is PR21856, but I am not familiar enough
with java to handle it. Do you have suggestions of functions which
should get the attribute? memcpy could, but that's really a special case
since it returns its argument, which is itself non-null.
The most obvious are things like xmalloc and wrappers around it,
probably the GC allocation routines and wrappers around those. Maybe
some of the str* and mem* functions as well, I'd have to look at the ISO
specs to be sure though.
You end up wanting to do a transitive closure on whatever initial set of
non-null functions you come up with and any functions which call them
and return those results unconditionally. You could even build a
warning around that -- ie, "is missing non-null return attribute" kind
of warning when you can prove that all values feeding the return
statement are non-null by whatever means are available to you.
With regard to 21856, the caveat I would mention for that is java as a
GCC front-end is becoming less and less important over time. So I'd
suggest focusing on how this can be used for the more important
languages and if getting the optimization for Java happens in the
process, great, but I wouldn't jump through any hoops specifically for java.
I'll open a new PR about all these. I mostly implemented returns_nonnull
because I'd just done the same for operator new and thus I knew exactly
where the optimization was, I can't promise I'll manage much of the rest.
Understood completely. Fine with me.
jeff