> On Nov 19, 2024, at 12:30, Martin Uecker <uec...@tugraz.at> wrote: > > Am Dienstag, dem 19.11.2024 um 09:18 -0800 schrieb Kees Cook: >> On Tue, Nov 19, 2024 at 05:41:13PM +0100, Martin Uecker wrote: >>> Am Dienstag, dem 19.11.2024 um 10:47 -0500 schrieb Marek Polacek: >>>> On Mon, Nov 18, 2024 at 07:10:35PM +0100, Martin Uecker wrote: >>>>> Am Montag, dem 18.11.2024 um 17:55 +0000 schrieb Qing Zhao: >>>>>> Hi, >>>>>> >>>>>> I am working on extending “counted_by” attribute to pointers inside a >>>>>> structure per our previous discussion. >>>>>> >>>>>> I need advice on the following question: >>>>>> >>>>>> Should -fsantize=bounds support array reference that was referenced >>>>>> through a pointer that has counted_by attribute? >>>> >>>> I don't see why it couldn't, perhaps as part of -fsanitize=bounds-strict. >>>> Someone has to implement it, though. >>> >>> I think Qing was volunteering to do this. My point was that >>> this would not necessarily be undefined behavior, but instead >>> could trap for possibly defined behavior. I would not mind, but >>> I point out that in the past people insisted that the sanitizers >>> are only intended to screen for undefined behavior. >> >> I think it's a mistake to confuse the sanitizers with only addressing >> "undefined behavior". The UB sanitizers are just a subset of the >> sanitizers in general, and I think UB is a not a good category for how >> to group the behaviors. >> >> For the Linux kernel, we want robustness. UB leads to ambiguity, so >> we're quite interested in getting rid of UB, but the bounds sanitizer is >> expected to implement bounds checking, regardless of UB-ness. >> >> I would absolutely want -fsanitize=bounds to check the construct Qing >> mentioned. >> >> Another aspect I want to capture for Linux is _pointer_ bounds, so that >> this would be caught: >> >> #include <stdlib.h> >> >> struct annotated { >> int b; >> int *c __attribute__ ((counted_by (b))); >> } *p_array_annotated; >> >> void __attribute__((__noinline__)) setup (int annotated_count) >> { >> p_array_annotated >> = (struct annotated *)malloc (sizeof (struct annotated)); >> p_array_annotated->c = (int *) malloc (annotated_count * sizeof (int)); >> p_array_annotated->b = annotated_count; >> >> return; >> } >> >> int main(int argc, char *argv[]) >> { >> int i; >> int *c; >> >> setup (10); >> c = p_array_annotated->c; >> for (i = 0; i < 11; i++) >> *c++ = 2; // goes boom at i == 10 >> return 0; >> } >> >> This may be a separate sanitizer, and it may require a totally different >> set of internal tracking, but being able to discover that we've run off >> the end of an allocation is needed. >> >> Of course, the biggest deal is that >> __builtin_dynamic_object_size(p_array_annotated->c, 1) will return >> 10 * sizeof(*p_array_annotated->c) > > I want this too.
Good news is: __builtin_dynamic_object_size for a field pointer with “counted_by” attribute already worked in my private workspace with a minimum adjustment based on the current implementation. -:) This part is straightforward. > The plan preliminary discussed in WG14 is to > have a proper language extension for this, tentatively: > > struct foo { > int n; > char (*p)[.n]; > }; > > (details to change, the syntax is what I would like to havE) This is nice! hopefully this could be included into C standard in the near future. Then, similarly, treating the following: struct foo { int n; char *p __attribute__ ((counted_by (n))); }; as an array with upper-bounds being “n” is reasonable. Thanks. Qing > > > >>> >>>> >>>>> I think the question is what -fsanitize=bounds is meant to be. >>>>> >>>>> I am a bit frustrated about the sanitizer. On the >>>>> one hand, it is not doing enough to get spatial memory >>>>> safety even where this would be easily possible, on the >>>>> other hand, is pedantic about things which are technically >>>>> UB but not problematic and then one is prevented from >>>>> using it >>>>> >>>>> When used in default mode, where execution continues, it >>>>> also does not mix well with many warning, creates more code, >>>>> and pulls in a libary dependency (and the library also depends >>>>> on upstream choices / progress which seems a limitation for >>>>> extensions). >>>>> >>>>> What IMHO would be ideal is a protection mode for spatial >>>>> memory safety that simply adds traps (which then requires >>>>> no library, has no issues with other warnings, and could >>>>> evolve independently from clang) >>>>> >>>>> So shouldn't we just add a -fboundscheck (which would >>>>> be like -fsanitize=bounds -fsanitize-trap=bounds just with >>>>> more checking) and make it really good? I think many people >>>>> would be very happy about this. >>>> >>>> That's a separate concern. We already have the -fbounds-check option, >>>> currently only used in Fortran (and D?), so perhaps we could make >>>> that option a shorthand for -fsanitize=bounds -fsanitize-trap=bounds. >>> >>> I think it could share large parts of the implementation, but the >>> main reason for having a separate option would be to do something >>> better than the sanitizer. So it could not simply be a shorthand. >> >> I don't want to reinvent the wheel here -- the sanitizers already have 3 >> modes of operation (trap, callback with details, callback without >> details), and Linux uses the first 2 modes already, and has had plans to >> use the third (smaller resulting image). >> >> Most notably, Linux _must_ have a warn-only mode or the feature will >> never get merged (this is a hard requirement from Linus). All serious >> deployments of the feature will use either trap mode or use the >> trap-on-warn setting, of course. But for the feature to even see the >> light of day, Linus requires there be a warn-only mode. >> >> So, given these requirements, continuing to use the sanitizer framework >> seems much simpler to me. :) > > This sounds reasonable. My remaining issue with the sanitizers > is that they are difficult to extend. If they had a single > entry point where you can synthesize a call that simply passes > the right message then this would be much easier. > > Martin