On Tue, Nov 19, 2024 at 05:41:13PM +0100, Martin Uecker wrote:
> Am Dienstag, dem 19.11.2024 um 10:47 -0500 schrieb Marek Polacek:
> > On Mon, Nov 18, 2024 at 07:10:35PM +0100, Martin Uecker wrote:
> > > Am Montag, dem 18.11.2024 um 17:55 +0000 schrieb Qing Zhao:
> > > > Hi,
> > > > 
> > > > I am working on extending “counted_by” attribute to pointers inside a 
> > > > structure per our previous discussion. 
> > > > 
> > > > I need advice on the following question:
> > > > 
> > > > Should -fsantize=bounds support array reference that was referenced 
> > > > through a pointer that has counted_by attribute? 
> > 
> > I don't see why it couldn't, perhaps as part of -fsanitize=bounds-strict.
> > Someone has to implement it, though.
> 
> I think Qing was volunteering to do this.  My point was that
> this would not necessarily be undefined behavior, but instead
> could trap for possibly defined behavior.  I would not mind, but
> I point out that in the past people insisted that the sanitizers
> are only intended to screen for undefined behavior.

I think it's a mistake to confuse the sanitizers with only addressing
"undefined behavior". The UB sanitizers are just a subset of the
sanitizers in general, and I think UB is a not a good category for how
to group the behaviors.

For the Linux kernel, we want robustness. UB leads to ambiguity, so
we're quite interested in getting rid of UB, but the bounds sanitizer is
expected to implement bounds checking, regardless of UB-ness.

I would absolutely want -fsanitize=bounds to check the construct Qing
mentioned.

Another aspect I want to capture for Linux is _pointer_ bounds, so that
this would be caught:

#include <stdlib.h>

struct annotated {
  int b;
  int *c __attribute__ ((counted_by (b)));
} *p_array_annotated;

void __attribute__((__noinline__)) setup (int annotated_count)
{
  p_array_annotated
    = (struct annotated *)malloc (sizeof (struct annotated));
  p_array_annotated->c = (int *) malloc (annotated_count *  sizeof (int));
  p_array_annotated->b = annotated_count;

  return;
}

int main(int argc, char *argv[])
{
  int i;
  int *c;

  setup (10);
  c = p_array_annotated->c;
  for (i = 0; i < 11; i++)
    *c++ = 2; // goes boom at i == 10
  return 0;
}

This may be a separate sanitizer, and it may require a totally different
set of internal tracking, but being able to discover that we've run off
the end of an allocation is needed.

Of course, the biggest deal is that
__builtin_dynamic_object_size(p_array_annotated->c, 1) will return
10 * sizeof(*p_array_annotated->c)

> 
> >  
> > > I think the question is what -fsanitize=bounds is meant to be.
> > > 
> > > I am a bit frustrated about the sanitizer.  On the
> > > one hand, it is not doing enough to get spatial memory
> > > safety even where this would be easily possible, on the
> > > other hand, is pedantic about things which are technically
> > > UB but not problematic and then one is prevented from
> > > using it
> > > 
> > > When used in default mode, where execution continues, it
> > > also does not mix well with many warning, creates more code,
> > > and pulls in a libary dependency (and the library also depends
> > > on upstream choices / progress which seems a limitation for
> > > extensions).
> > > 
> > > What IMHO would be ideal is a protection mode for spatial
> > > memory safety that simply adds traps (which then requires
> > > no library, has no issues with other warnings, and could
> > > evolve independently from clang) 
> > > 
> > > So shouldn't we just add a -fboundscheck (which would 
> > > be like -fsanitize=bounds -fsanitize-trap=bounds just with
> > > more checking) and make it really good? I think many people
> > > would be very happy about this.
> > 
> > That's a separate concern.  We already have the -fbounds-check option,
> > currently only used in Fortran (and D?), so perhaps we could make
> > that option a shorthand for -fsanitize=bounds -fsanitize-trap=bounds.
> 
> I think it could share large parts of the implementation, but the
> main reason for having a separate option would be to do something
> better than the sanitizer.  So it could not simply be a shorthand.

I don't want to reinvent the wheel here -- the sanitizers already have 3
modes of operation (trap, callback with details, callback without
details), and Linux uses the first 2 modes already, and has had plans to
use the third (smaller resulting image).

Most notably, Linux _must_ have a warn-only mode or the feature will
never get merged (this is a hard requirement from Linus). All serious
deployments of the feature will use either trap mode or use the
trap-on-warn setting, of course. But for the feature to even see the
light of day, Linus requires there be a warn-only mode.

So, given these requirements, continuing to use the sanitizer framework
seems much simpler to me. :)

-Kees

-- 
Kees Cook

Reply via email to