Re: Should -fsanitize=bounds support counted-by attribute for pointers inside a structure?

Martin Uecker Thu, 12 Dec 2024 14:37:49 -0800

Am Donnerstag, dem 12.12.2024 um 13:59 -0800 schrieb Bill Wendling:
> On Thu, Dec 12, 2024 at 1:28 PM Martin Uecker <[email protected]> wrote:
> > 
> > Am Montag, dem 09.12.2024 um 16:20 +0000 schrieb Qing Zhao:
> > > 
> > > > On Dec 7, 2024, at 03:57, Martin Uecker <[email protected]> wrote:
> > > > 
> > > > Am Freitag, dem 06.12.2024 um 16:13 +0000 schrieb Qing Zhao:
> > > > > 
> > > > > > On Dec 6, 2024, at 10:56, Martin Uecker <[email protected]> wrote:
> > > > > > 
> > > > > > Am Freitag, dem 06.12.2024 um 14:16 +0000 schrieb Qing Zhao:
> > > > > > > 
> > > > 
> > > > ...
> > > > 
> > > > > > > > 
> > > > > > > > I think the relevant scenario is where you assign the struct and
> > > > > > > > not a pointer to the struct, i.e. something like the following:
> > > > > > > > 
> > > > > > > > #include "builtin-object-size-common.h"
> > > > > > > > 
> > > > > > > > struct annotated {
> > > > > > > > int b;
> > > > > > > > int *c __attribute__ ((counted_by (b)));
> > > > > > > > };
> > > > > > > > 
> > > > > > > > struct annotated __attribute__((__noinline__)) setup (int 
> > > > > > > > attr_count)
> > > > > > > > {
> > > > > > > > struct annotated p_array_annotated;
> > > > > > > > p_array_annotated.c = (int *) malloc (sizeof (int) * 
> > > > > > > > attr_count);
> > > > > > > > p_array_annotated.b = attr_count;
> > > > > > > > 
> > > > > > > > return p_array_annotated;
> > > > > > > > }
> > > > > > > > 
> > > > > > > > 
> > > > > > > > int main(int argc, char *argv[])
> > > > > > > > {
> > > > > > > > struct annotated x = setup (10);
> > > > > > > > int *p = x.c;
> > > > > > > > x = setup (20);
> > > > > > > > EXPECT(__builtin_dynamic_object_size (p, 1), 10 * sizeof (int));
> > > > > > > > EXPECT(__builtin_dynamic_object_size (x.c, 1), 20 * sizeof 
> > > > > > > > (int));
> > > > > > > > DONE ();
> > > > > > > > }
> > > > > > > > 
> > > > > > > 
> > > > > > > With the above testing case, my current implementation based on 
> > > > > > > .ACCESS_WITH_SIZE succeed without any issue.  -:)
> > > > > > > The design of .ACCESS_WITH_SIZE already resolved this issue.
> > > > > > 
> > > > > > Ok, thanks!  But I am a bit confused, because it seems it behaves
> > > > > > this way also for FAMs
> > > > > > 
> > > > > > https://godbolt.org/z/64a6z4cna
> > > > > 
> > > > > The behavior of the above testing case is exactly the additional 
> > > > > feature we provided for counted_by attribute for FAM:
> > > > > 
> > > > > =====
> > > > >     One important feature of the attribute is, a reference to the
> > > > >     flexible array member field uses the latest value assigned to the
> > > > >     field that represents the number of the elements before that
> > > > >     reference.  For example,
> > > > > 
> > > > >            p->count = val1;
> > > > >            p->array[20] = 0;  // ref1 to p->array
> > > > >            p->count = val2;
> > > > >            p->array[30] = 0;  // ref2 to p->array
> > > > > 
> > > > >     in the above, 'ref1' uses 'val1' as the number of the elements in
> > > > >     'p->array', and 'ref2' uses 'val2' as the number of elements in
> > > > >     'p->array’.
> > > > > =====
> > > > > 
> > > > > So, it’s the correct behavior for the counted_by attribute for FAM 
> > > > > based on our previous discussion and agreement.
> > > > 
> > > > If it is indeed that the value of p->count last stored before p->array 
> > > > is
> > > > *referenced* which counts, then everything is well.
> > > 
> > > Yes, For FAM, every “reference” to p->array will be converted as a call 
> > > to (*.ACCESS_WITH_SIZE (p->array, &p->count, …))
> > 
> > Can you remind why we have to pass the address of p->count, i.e. &p->count
> > instead of its value?
> > 
> So that if we change the value of p->count it will be reflected in
> future checks.
> 
> p->count = n;
> p->array[3] = x;
> // ...
> p->count = m;
> p->array[3] = y;
> 
> We would want the last "p->array[3]" to be checked against the new
> value of p->count rather than the original value.


But wouldn't at the second access to p->array not a new call to
.ACCESS_WITH_SIZE be inserted?

> 
> > > 
> > > The count value for p->array is  *(&p->count), which is guaranteed to be 
> > > the last stored value of the address of p->count before the current 
> > > reference to p->array.
> > > 
> > > Similarly, for the pointer array,  every “reference” to p->pa will be 
> > > converted as a call to .ACCESS_WITH_SIZE(p->pa, &p->count…). The count 
> > > value of the pointer array p->pa is *(&p->count), which is also 
> > > guaranteed to be the last stored value of the address of p->count before 
> > > the current reference to p->pa.
> > > 
> > > > Somehow I thought for FAMs it is the value p->count last stored before
> > > > p->array is *accessed* (possibly indirectly via another pointer).  
> > > > Probably
> > > > it was just me being confused.
> > > > 
> > > > > 
> > > > > However, as you pointed out, when the “counted_by” attribute is 
> > > > > extended to  the pointer field, this feature will be problematic.
> > > > > And we need to add the following additional new requirement for the 
> > > > > “counted_by” attribute of pointer field:
> > > > > 
> > > > > p->count and  p->array  can only be changed by changing the whole 
> > > > > structure at the same time.
> > > > 
> > > > Actually, I am then not even sure we need this requirement. My point 
> > > > was only that
> > > > setting the whole structure at the time should work correctly, i.e. 
> > > > without changing
> > > > the bounds for old pointers which were stored in the struct previously. 
> > > >  With the
> > > > semantics  above it seems this case also works automatically.
> > > 
> > > For pointer field with counted_by attribute, if the p->count and p->pa 
> > > are not set together when changing the whole structure, then for example:
> > > 
> > > struct annotated {
> > >   int b;
> > >   int *c __attribute__ ((counted_by (b)));
> > > };
> > > 
> > > /* note, this routine only allocate the space for the pointer array 
> > > field, but does NOT set the counted_by field.  */
> > > struct annotated __attribute__((__noinline__)) setup (int attr_count)
> > > {
> > >   struct annotated p_array_annotated;
> > >   p_array_annotated.c = (int *) malloc (sizeof (int) * attr_count);
> > > 
> > >   return p_array_annotated;
> > > }
> > > 
> > > int main(int argc, char *argv[])
> > > {
> > >   struct annotated x = setup (10);
> > >   x.b = 10;  /* the counted_by is set here.  */
> > >   int *p = x.c;       /* x.c’s count should be 10, which is correct.  */
> > >   x = setup (20);
> > >   __builtin_dynamic_object_size (x.c, 1); /* x.c’s count now should be 
> > > 20, but it’s not set yet,
> > > the object size is unknown at this moment.  Shall we ask the user to 
> > > always set the
> > > counted_by value when the corresponding pointer is changed?  */
> > 
> > I think the requirement should simply be that the counted_by value
> > is set before the pointer is assigned.
> > 
> > We could then also check that the pointer which is assigned does
> > not point into something smaller at the time where it is assigned.
> > 
> This may be a bit off topic for this email, but I've been thinking
> about performing bounds checking on an array that was assigned to a
> local variable. Bounds checking via __counted_by requires the
> __counted_by attribute be available, however it's *not* available if
> we assign the pointer to a local variable. 

It is available:

https://godbolt.org/z/KbqMPYfK3
> 
> In Qing's example:
> 
>   struct annotated {
>     int b;
>     int *c __attribute__((counted_by(b)));
>   };
> 
>   struct annotated __attribute__((__noinline__)) setup(int attr_count) {
>     struct annotated *p;
> 
>     p = (struct annotated *)malloc(sizeof(struct annotated));
>     p->b = attr_count;
>     p->c = (int *)malloc(sizeof(int) * attr_count);
> 
>     return p;
>   }
> 
>   int main(int argc, char **argv) {
>     struct annotated *x = setup(10);
>     int *p;
> 
>     p = x->c; // Loses the __counted_by attribute.
>     x = setup(20);
>     p[15] = argc;  // Can't check.
> 
>     return 0;
>   }
> 
> The problem is the same as if the pointer were passed into a function.
> (I'm not sure about an inlined function...maybe in GCC? Clang does
> inling very differently.) There could also be complex code between
> assigning to the local variable and where a check occurs that could
> cause issues (though GCC's .ACCESS_WITH_SIZE could be robust enough to
> deal with that). Say for instance that somewhere in the code path the
> local is stored:
> 
>   int **store;
> 
>   store[37] = p;
> 
>   // code where 'store' is used a lot.
> 
>   p = store[37];
> 
>   printf("value %d\n", p[20]);  // Is it possible to add a check here?
> 
> All of this to say, it might be a good first step to simply ensure
> that we will perform the sanitizer steps only when accessed via a
> pointer to the struct. It might even avoid confusion by the programmer
> if checks are performed in some cases (access through the struct
> pointer, inlined functions, etc.) but not in others (passed into
> non-inlined functions, complex code paths where it's hard to determine
> the original source of the pointer).

For me, the clearest way is to use C99's variably modified types
which allows passing the bound into a function in another
TU. I used similar features in my experimental container library:


https://godbolt.org/z/o6j896Th5

But this does not help existing code.

Martin

Re: Should -fsanitize=bounds support counted-by attribute for pointers inside a structure?

Reply via email to