> On Nov 19, 2024, at 12:30, Martin Uecker <uec...@tugraz.at> wrote:
> 
> Am Dienstag, dem 19.11.2024 um 09:18 -0800 schrieb Kees Cook:
>> On Tue, Nov 19, 2024 at 05:41:13PM +0100, Martin Uecker wrote:
>>> Am Dienstag, dem 19.11.2024 um 10:47 -0500 schrieb Marek Polacek:
>>>> On Mon, Nov 18, 2024 at 07:10:35PM +0100, Martin Uecker wrote:
>>>>> Am Montag, dem 18.11.2024 um 17:55 +0000 schrieb Qing Zhao:
>>>>>> Hi,
>>>>>> 
>>>>>> I am working on extending “counted_by” attribute to pointers inside a 
>>>>>> structure per our previous discussion. 
>>>>>> 
>>>>>> I need advice on the following question:
>>>>>> 
>>>>>> Should -fsantize=bounds support array reference that was referenced 
>>>>>> through a pointer that has counted_by attribute? 
>>>> 
>>>> I don't see why it couldn't, perhaps as part of -fsanitize=bounds-strict.
>>>> Someone has to implement it, though.
>>> 
>>> I think Qing was volunteering to do this.  My point was that
>>> this would not necessarily be undefined behavior, but instead
>>> could trap for possibly defined behavior.  I would not mind, but
>>> I point out that in the past people insisted that the sanitizers
>>> are only intended to screen for undefined behavior.
>> 
>> I think it's a mistake to confuse the sanitizers with only addressing
>> "undefined behavior". The UB sanitizers are just a subset of the
>> sanitizers in general, and I think UB is a not a good category for how
>> to group the behaviors.
>> 
>> For the Linux kernel, we want robustness. UB leads to ambiguity, so
>> we're quite interested in getting rid of UB, but the bounds sanitizer is
>> expected to implement bounds checking, regardless of UB-ness.
>> 
>> I would absolutely want -fsanitize=bounds to check the construct Qing
>> mentioned.
>> 
>> Another aspect I want to capture for Linux is _pointer_ bounds, so that
>> this would be caught:
>> 
>> #include <stdlib.h>
>> 
>> struct annotated {
>>  int b;
>>  int *c __attribute__ ((counted_by (b)));
>> } *p_array_annotated;
>> 
>> void __attribute__((__noinline__)) setup (int annotated_count)
>> {
>>  p_array_annotated
>>    = (struct annotated *)malloc (sizeof (struct annotated));
>>  p_array_annotated->c = (int *) malloc (annotated_count *  sizeof (int));
>>  p_array_annotated->b = annotated_count;
>> 
>>  return;
>> }
>> 
>> int main(int argc, char *argv[])
>> {
>>  int i;
>>  int *c;
>> 
>>  setup (10);
>>  c = p_array_annotated->c;
>>  for (i = 0; i < 11; i++)
>>    *c++ = 2; // goes boom at i == 10
>>  return 0;
>> }
>> 
>> This may be a separate sanitizer, and it may require a totally different
>> set of internal tracking, but being able to discover that we've run off
>> the end of an allocation is needed.
>> 
>> Of course, the biggest deal is that
>> __builtin_dynamic_object_size(p_array_annotated->c, 1) will return
>> 10 * sizeof(*p_array_annotated->c)
> 
> I want this too.

Good news is: 
__builtin_dynamic_object_size for a field pointer with “counted_by” attribute
 already worked in my private workspace with a minimum adjustment based on
the current implementation. -:)
This part is straightforward.


> The plan preliminary discussed in WG14 is to
> have a proper language extension for this, tentatively:
> 
> struct foo {
>  int n;
>  char (*p)[.n];
> };
> 
> (details to change, the syntax is what I would like to havE)

This is nice!  hopefully this could be included into C standard in the near 
future. 

Then, similarly, treating the following:

struct foo {
  int n;
  char *p __attribute__ ((counted_by (n)));
};

as an array with upper-bounds being “n” is reasonable. 

Thanks.

Qing

> 
> 
> 
>>> 
>>>> 
>>>>> I think the question is what -fsanitize=bounds is meant to be.
>>>>> 
>>>>> I am a bit frustrated about the sanitizer.  On the
>>>>> one hand, it is not doing enough to get spatial memory
>>>>> safety even where this would be easily possible, on the
>>>>> other hand, is pedantic about things which are technically
>>>>> UB but not problematic and then one is prevented from
>>>>> using it
>>>>> 
>>>>> When used in default mode, where execution continues, it
>>>>> also does not mix well with many warning, creates more code,
>>>>> and pulls in a libary dependency (and the library also depends
>>>>> on upstream choices / progress which seems a limitation for
>>>>> extensions).
>>>>> 
>>>>> What IMHO would be ideal is a protection mode for spatial
>>>>> memory safety that simply adds traps (which then requires
>>>>> no library, has no issues with other warnings, and could
>>>>> evolve independently from clang) 
>>>>> 
>>>>> So shouldn't we just add a -fboundscheck (which would 
>>>>> be like -fsanitize=bounds -fsanitize-trap=bounds just with
>>>>> more checking) and make it really good? I think many people
>>>>> would be very happy about this.
>>>> 
>>>> That's a separate concern.  We already have the -fbounds-check option,
>>>> currently only used in Fortran (and D?), so perhaps we could make
>>>> that option a shorthand for -fsanitize=bounds -fsanitize-trap=bounds.
>>> 
>>> I think it could share large parts of the implementation, but the
>>> main reason for having a separate option would be to do something
>>> better than the sanitizer.  So it could not simply be a shorthand.
>> 
>> I don't want to reinvent the wheel here -- the sanitizers already have 3
>> modes of operation (trap, callback with details, callback without
>> details), and Linux uses the first 2 modes already, and has had plans to
>> use the third (smaller resulting image).
>> 
>> Most notably, Linux _must_ have a warn-only mode or the feature will
>> never get merged (this is a hard requirement from Linus). All serious
>> deployments of the feature will use either trap mode or use the
>> trap-on-warn setting, of course. But for the feature to even see the
>> light of day, Linus requires there be a warn-only mode.
>> 
>> So, given these requirements, continuing to use the sanitizer framework
>> seems much simpler to me. :)
> 
> This sounds reasonable. My remaining issue with the sanitizers
> is that they are difficult to extend.  If they had a single
> entry point where you can synthesize a call that simply passes
> the right message then this would be much easier.
> 
> Martin


Reply via email to