On Wed Jun 29, 2022 at 7:39 PM CEST, David Malcolm wrote:
> On Wed, 2022-06-29 at 17:39 +0200, Tim Lange wrote:
>
> > Hi,
>
> Thanks for the updated patch.
>
> Overall, looks nearly ready; various nits inline below, throughout...
>
> > 
> > I've addressed most of the points from the review.
> > * The allocation size warning warns whenever region_model::get_capacity 
> > returns
> > something useful, i.e. also on statically-allocated regions.
>
> Thanks.  Looks like you added test coverage for this in allocation-
> size-5.c
>
> > * I added a new virtual function to the pending-diagnostic class, so
> that it
> > is possible to emit a custom region creation description.
> > * The test cases should have a better coverage now.
> > * Conservative struct handling
> > 
> > The warning now looks like this:
> > /path/to/main.c:9:8: warning: allocated buffer size is not a multiple of 
> > the pointee's size [CWE-131] [-Wanalyzer-allocation-size]
> >     9 |   int *iptr = ptr;
> >       |        ^~~~
> >   ‘main’: events 1-2
> >     |
> >     |    8 |   void *ptr = malloc((long unsigned int)n * sizeof(short));
> >     |      |               ^~~~~~~~~~~~~~~~~~~~~~~~~
> >     |      |               |
> >     |      |               (1) allocated ‘(long unsigned int)n * 2’ bytes 
> > here
> >     |    9 |   int *iptr = ptr;
> >     |      |        ~~~~    
> >     |      |        |
> >     |      |        (2) assigned to ‘int *’ here; ‘sizeof(int)’ is ‘4’
> >     |
>
> Looks great.
>
> > 
> > /path/to/main.c:15:15: warning: allocated buffer size is not a multiple of 
> > the pointee's size [CWE-131] [-Wanalyzer-allocation-size]
> >    15 |   int *ptrw = malloc (sizeof (short));
> >       |               ^~~~~~~~~~~~~~~~~~~~~~~
> >   ‘main’: events 1-2
> >     |
> >     |   15 |   int *ptrw = malloc (sizeof (short));
> >     |      |               ^~~~~~~~~~~~~~~~~~~~~~~
> >     |      |               |
> >     |      |               (1) allocated ‘2’ bytes here
>
> Looks a bit weird to be quoting a number here; maybe whenever the
> expression is just a constant, print it unquoted?  (though that could
> be fiddly to implement, so can be ignored if it turns out to be) .

Isn't the 'q' in '%qE' responsible for quoting. Using '%E' instead if
m_expr is an INTEGER_CST works.

Otherwise, I've left the quoting on the "'sizeof (int)' is '4'" note.
I do think thata looks better than without.

/path/to/main.c:16:15: warning: allocated buffer size is not a multiple of the 
pointee's size [CWE-131] [-Wanalyzer-allocation-size]
   16 |   int *ptrw = malloc (21 * sizeof (short));
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
  ‘main’: events 1-2
    |
    |   16 |   int *ptrw = malloc (21 * sizeof (short));
    |      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |               |
    |      |               (1) allocated 42 bytes here
    |      |               (2) assigned to ‘int *’ here; ‘sizeof (int)’ is ‘4’
    |

>
>
> >     |      |               (2) assigned to ‘int *’ here; ‘sizeof (int)’ is 
> > ‘4’
> >     |
> > The only thing I couldn't address was moving the second event toward the 
> > lhs or
> > assign token here. I tracked it down till get_stmt_location where it seems 
> > that
> > the rhs is actually the location of the statement. Is there any way to get 
> > (2)
> > to be focused on the lhs?
>
> Annoyingly, we've lost a lot of location information by the time the
> analyzer runs.
>
> In theory we could special-case for when we have the def-stmt of the
> SSA_NAME that's that default (i.e. initial) value of a VAR_DECL, and if
> we see the write is there, we could use the DECL_SOUCE_LOCATION of the
> VAR_DECL for the write, so that we'd get:
>
>     |   15 |   int *ptrw = malloc (sizeof (short));
>     |      |        ^~~~   ^~~~~~~~~~~~~~~~~~~~~~~
>     |      |        |      |
>     |      |        |      (1) allocated ‘2’ bytes here
>     |      |        (2) assigned to ‘int *’ here; ‘sizeof (int)’ is ‘4’
>     |
>
> which is perhaps slightly more readable.  I'm not sure it's worth it
> though.

Hm, okay. I've left that out for now.

>
> > 
> > Otherwise, the patch compiled coreutils, openssh, curl and httpd without any
> > false-positive (but none of them contained a bug found by the checker 
> > either).
>
> Great.
>
> > `make check-gcc RUNTESTFLAGS="analyzer.exp"` tests pass and as I just 
> > worked on
> > the event splitting, the regression tests are yet to run.
> > 
> > - Tim
> > 
> > 
> > This patch adds an checker that warns about code paths in which a buffer is
> > assigned to a incompatible type, i.e. when the allocated buffer size is not 
> > a
> > multiple of the pointee's size.
> > 
> > gcc/analyzer/ChangeLog:
>
> You should add a reference to the RFE bug to the top of the ChangeLog entries:
>           PR analyzer/105900
>
> Please also add it to the commit message, in the form " [PR105900]";
> see the examples section twoards the end of
> https://gcc.gnu.org/contribute.html#patches
>
>
> > 
> >         * analyzer.opt: Added Wanalyzer-allocation-size.
>
> [...snip...]
>
> > 
> > gcc/ChangeLog:
>
> ...and here
>
> > 
> >         * doc/invoke.texi: Added Wanalyzer-allocation-size.
> > 
> > gcc/testsuite/ChangeLog:
>
> ...and here
>
> > 
> >         * gcc.dg/analyzer/pr96639.c: Changed buffer size to omit warning.
> >         * gcc.dg/analyzer/allocation-size-1.c: New test.
> >         * gcc.dg/analyzer/allocation-size-2.c: New test.
> >         * gcc.dg/analyzer/allocation-size-3.c: New test.
> >         * gcc.dg/analyzer/allocation-size-4.c: New test.
> >         * gcc.dg/analyzer/allocation-size-5.c: New test.
> > 
> > Signed-off-by: Tim Lange <m...@tim-lange.me>
>
> [...snip...]
>
> > diff --git a/gcc/analyzer/analyzer.opt b/gcc/analyzer/analyzer.opt
> > index 4aea52d3a87..912def2faf2 100644
> > --- a/gcc/analyzer/analyzer.opt
> > +++ b/gcc/analyzer/analyzer.opt
> > @@ -54,6 +54,10 @@ The minimum number of supernodes within a function for 
> > the analyzer to consider
> >  Common Joined UInteger Var(param_analyzer_max_enodes_for_full_dump) 
> > Init(200) Param
> >  The maximum depth of exploded nodes that should appear in a dot dump 
> > before switching to a less verbose format.
> >  
> > +Wanalyzer-allocation-size
> > +Common Var(warn_analyzer_allocation_size) Init(1) Warning
> > +Warn about code paths in which a buffer is assigned to a incompatible type.
>
> Reword "buffer" to "pointer to a buffer", I think.
>
> "a incompatible" -> "an incompatible"
>
> [...snip...]
>
> > +/* Concrete subclass for casts of pointers that lead to trailing bytes.  */
> > +
> > +class dubious_allocation_size
> > +: public pending_diagnostic_subclass<dubious_allocation_size>
> > +{
> > +public:
> > +  dubious_allocation_size (const region *lhs, const region *rhs)
> > +  : m_lhs (lhs), m_rhs (rhs), m_expr (NULL_TREE)
> > +  {}
>
> [...snip...]
>
> > +  bool operator== (const dubious_allocation_size &other) const
> > +  {
> > +    return m_lhs == other.m_lhs && m_rhs == other.m_rhs;
>
> Probably should also check that:
>   same_tree_p (m_expr, other.m_expr);
>
> [...snip...]
>
> > +/* Return true on dubious allocation sizes for constant sizes.  */
> > +
> > +static bool
> > +capacity_compatible_with_type (tree cst, tree pointee_size_tree,
> > +                          bool is_struct)
> > +{
> > +  unsigned HOST_WIDE_INT pointee_size = TREE_INT_CST_LOW 
> > (pointee_size_tree);
> > +  if (pointee_size == 0)
> > +    return 0;
>
> "false" rather than 0, given that this is bool.
>
> [...snip...]
>
> > +/* Return true if the lhs and rhs of an assignment have different types.  
> > */
> > +
> > +static bool
> > +is_any_cast_p (const gimple *stmt)
> > +{
> > +  if (const gassign *assign = dyn_cast<const gassign *>(stmt))
> > +    return gimple_assign_cast_p (assign)
> > +     || (gimple_num_ops (assign) == 2
> > +         && !pending_diagnostic::same_tree_p (
> > +                               TREE_TYPE (gimple_assign_lhs (assign)),
> > +                               TREE_TYPE (gimple_assign_rhs1 (assign))));
>
> The "== 2" subclause in the above condition doesn't look quite right to
> me; what statements did you encounter that needed this?

I wanted to ensure that I do actually have a pointer on the rhs. In the
patch this is already ensured by only handling region_svals. Thus, I
removed the check.

>
> [...snip...]
>
> > @@ -9758,6 +9759,18 @@ By default, the analysis silently stops if the code 
> > is too
> >  complicated for the analyzer to fully explore and it reaches an internal
> >  limit.  The @option{-Wanalyzer-too-complex} option warns if this occurs.
> >  
> > +@item -Wno-analyzer-allocation-size
> > +@opindex Wanalyzer-allocation-size
> > +@opindex Wno-analyzer-allocation-size
> > +This warning requires @option{-fanalyzer}, which enables it; use
> > +@option{-Wno-analyzer-allocation-size}
> > +to disable it.
> > +
> > +This diagnostic warns for paths through the code in which a buffer is 
> > casted
> > +to a type where the buffer size is not a multiple of the pointee size.
>
> At the risk of bikeshedding, how about:
>
> This diagnostic warns for paths through the code in which a pointer
> is assigned to point at a buffer with a size that is not a multiple
> of sizeof(*pointer).
>
> See @url{https://cwe.mitre.org/data/definitions/131.html, CWE-131: Incorrect 
> Calculation of Buffer Size}.
>
>
> [...snip...]
>
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c 
> > b/gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c
> > new file mode 100644
> > index 00000000000..02634ae883b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-1.c
> > @@ -0,0 +1,102 @@
> > +#include <stdlib.h>
> > +#include <stdio.h>
> > +
> > +/* Tests with constant buffer sizes.  */
> > +
> > +void test_1 (void)
> > +{
> > +  short *ptr = malloc (21 * sizeof (short));
> > +  free (ptr);
> > +}
> > +
> > +void test_2 (void)
> > +{
> > +  int *ptr = malloc (21 * sizeof (short)); /* { dg-line malloc2 } */
> > +  free (ptr);
> > +
> > +  /* { dg-warning "" "" { target *-*-* } malloc2 } */
> > +  /* { dg-message "" "" { target *-*-* } malloc2 } */
>
> The various dg-warning and dg-message directives here (and throughout
> the rest of the patch) shouldn't have just "" "" for their first two
> args.
>
> The first arg should be a regexp that matches some (nonempty) subset of
> the expected text.  There's a balance to be struck between:
> (a) terseness to avoid "gold plating" the test output (where making any
> change to the wording would involve lots of tedious updates to test
> directives)
> versus
> (b) giving us test coverage that the message is sane, so that if we
> accidentally break the wording due to future changes to the analyzer,
> then at least one test starts failing
>
> Probably best for most of these regexps to be terse, but an empty
> regexp is too terse.
>
> The 2nd arg helps us disambiguate with directive we're talking about,
> so can be "warning" and "note" for the two above.

I've looked at other tests and added the regexps similar to those. The
notes are terse and only make sure that the size is printed
whenever its possible. Otherwise, I couldn't think of a way to provide a
terse regexp of the warning. But thats easily replacable because it is
just a constant string.

>
> [...snip...]
>
> > +void test_5 (void)
> > +{
> > +  int user_input;
> > +  scanf("%i", &user_input);
> > +  int n;
> > +  if (user_input == 0)
> > +    n = 21 * sizeof (short);
> > +  else
> > +    n = 42 * sizeof (short);
>
> I see you've used scanf, presumably to get a symbolic value for the
> variable.  If so, a simpler way to do this is to simply use a parameter
> to the test function.  But there's no need to change these test cases.
>
> Perhaps scanf should taint its arguments, which is
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106021 but obviously that
> would be a different patch.
>
> [...snip...]
>
> > +void test_9(void) 
> > +{
> > +  // FIXME
>
> Please make this comment more descriptive about what the issue here is.
>
> > +  int *buf = create_buffer(42); /* { dg-warning "" "" { xfail *-*-*
> } } */
> > +  free (buf);
> > +}
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c
> b/gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c
> > new file mode 100644
> > index 00000000000..cb35a9d717b
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-2.c
>
> [...snip...]
>
> > +void test_8(void) 
> > +{
> > +  int n;
> > +  scanf("%i", &n);
> > +  // FIXME
>
> Make the comment more descriptive please.
>
> > +  int *buf = create_buffer(n * sizeof(short)); /* { dg-warning "" ""
> { xfail *-*-* } } */
> > +  free (buf);
> > +}
> > +
> > +void test_9 (void)
> > +{
> > +  int n;
> > +  scanf("%i", &n);
> > +  /* n is a conjured_svalue.  */
> > +  void *ptr = malloc (n); /* { dg-message } */
> > +  int *iptr = (int *)ptr; /* { dg-line assign9 } */
> > +  free (iptr);
> > +
> > +  /* { dg-warning "" "" { target *-*-* } assign9 } */
> > +  /* { dg-message "" "" { target *-*-* } assign9 } */
> > +}
> > +
> > +void test_11 (void)
> > +{
> > +  int n;
> > +  scanf("%i", &n);
> > +  void *ptr = malloc (n);
> > +  if (n == 4)
>
> Presumably this should be a test against sizeof (int), rather than 4?

In the patch, it didn't matter. As it seemed impossible to evaluate
whether 'n % sizeof (type) == 0' is true, I did fall back to assume that
a guarded variable is okay. In the new patch, I weakened the
approximation and actually use the equivalence classes, so that this
actually should be sizeof (int).

I will send the updated patch in another mail in some time after I have
formatted it.

- Tim

>
> Please add a testcase where the comparison is against the wrong
> constant.
>
> > +    {
> > +      /* n is a conjured_svalue but guarded.  */
> > +      int *iptr = (int *)ptr;
> > +      free (iptr);
> > +    }
> > +  else
> > +    free (ptr);
> > +}
>
> [...snip...]
>
> > diff --git a/gcc/testsuite/gcc.dg/analyzer/allocation-size-5.c
> b/gcc/testsuite/gcc.dg/analyzer/allocation-size-5.c
> > new file mode 100644
> > index 00000000000..afb1782e0cd
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/analyzer/allocation-size-5.c
> > @@ -0,0 +1,40 @@
> > +#include <stdlib.h>
> > +#include <stdio.h>
> > +
> > +/* Tests related to structs.  */
>
> Looks like this was copied-and-pasted, and should be updated to "Tests
> of statically-sized buffers" or somesuch.
>
> > +
> > +typedef struct a {
> > +  short s;
> > +} a;
> > +
> > +int *test_1 (void)
> > +{
> > +  a A;
> > +  A.s = 1;
> > +  int *ptr = (int *) &A; /* { dg-line assign1 } */
> > +  return ptr;
> > +
> > +  /* { dg-warning "" "" { target *-*-* } assign1 } */
> > +  /* { dg-message "" "" { target *-*-* } assign1 } */
> > +}
> > +
> > +int *test2 (void)
> > +{
> > +  char arr[4];
>
> I think this needs to be sizeof(int), rather than 4.
>
> > +  int *ptr = (int *)arr;
> > +  return ptr;
> > +}
> > +
> > +int *test3 (void)
> > +{
> > +  char arr[2];
> > +  int *ptr = (int *)arr; /* { dg-line assign3 } */
> > +  return ptr;
> > +
> > +  /* { dg-warning "" "" { target *-*-* } assign3 } */
> > +  /* { dg-message "" "" { target *-*-* } assign3 } */
> > +}
> > +
> > +int main() {
> > +  return 0;
> > +}
>
> [...snip...]
>
> Thanks again for the v2 patch; hope the above makes sense
> Dave

Reply via email to