> On Nov 30, 2024, at 07:22, Martin Uecker <[email protected]> wrote:
>
> Am Dienstag, dem 26.11.2024 um 20:59 +0000 schrieb Qing Zhao:
>> Think this over these days, I have another thought that need some feedback:
>>
>> The major issue right now is:
>>
>> 1. For the following structure in which the “counted_by” attributes is
>> attached to the pointer field.
>>
>> struct foo {
>> int n;
>> char *p __attribute__ ((counted_by (n)));
>> } *x;
>
> BTW: I also do not like the syntax 'n' for a lookup of a member
> in the member namespace. I think it should be '.n'. For FAM this
> is less problematic because it always at the end, but here it is
> problematic.
During my implementation of extending this attribute to pointer fields of
structure.
I didn’t notice issue with the current syntax ’n’ for the pointer fields so
far,
even though when the field “n” is declared after the corresponding pointer
field, i.e:
struct foo {
{
char *p __attribute__ ((counted_by (n)));
int n;
}
So, could you please explain a little bit more on what’s the potential issue
here?
>>
>> There is one important additional requirement:
>>
>> x->n, x->p can ONLY be changed by changing the whole structure at the same
>> time.
>> Otherwise, x->n might not be consistent with x->p.
>
> By itself, this would still not fix the issue I pointed out.
>
> struct foo x;
> x = .. ; // set the whole structure
> char *p = x->p;
> x = ... ; // set the whole structure
>
> What is the bound for 'p' ?
Since p was set to the pointer field of the old structure, then the bound of it
should be the old bound.
> With current rules it would be the old bound.
I thought that this should be the correct behavior, isn’t it?
>
>
>>
>> 2. This new requirement is ONLY for “counted_by” attribute that is attached
>> to the pointer field, not needed
>> for flexible array members.
>>
>> 3. Then there will be inconsistency for the “counted_by” attribute between
>> FAM and pointer field.
>
>
>> The major questions I have right now:
>>
>> 1. Shall we keep this inconsistency between FAM and pointer field?
>> Or,
>>
>> 2. Shall we keep them consistent by adding this new requirement for the
>> previous FAM?
>
> Or have a new attribute? I feel we double down on a bad design
> which made sense for FAM addressing a very specific use case
> (retrofitting checks to the Linux kernel) but is otherwise not
> very strong.
I do agree that the previous specific feature of the “counted_by" for the FAM,
i.e:
"
One important feature of the attribute is, a reference to the
flexible array member field uses the latest value assigned to the
field that represents the number of the elements before that
reference. For example,
p->count = val1;
p->array[20] = 0; // ref1 to p->array
p->count = val2;
p->array[30] = 0; // ref2 to p->array
in the above, 'ref1' uses 'val1' as the number of the elements in
'p->array', and 'ref2' uses 'val2' as the number of elements in
'p->array’.
“
Is a specific feature for Linux kernel, I am wondering whether the above feature
really needed by the Linux kernel?
If Not, I prefer to eliminate this specific feature from GCC before we
officially release the “counted_by”
attribute in GCC15.
If Linux kernel does need this feature, Yes, maybe a new attribute for pointer
is better.
Kees, could you please also provide more comments and suggestion on this?
>
> Martin
>
>>
>> Kees, the following feature that you requested for the FAM:
>>
>> "
>> One important feature of the attribute is, a reference to the
>> flexible array member field uses the latest value assigned to the
>> field that represents the number of the elements before that
>> reference. For example,
>>
>> p->count = val1;
>> p->array[20] = 0; // ref1 to p->array
>> p->count = val2;
>> p->array[30] = 0; // ref2 to p->array
>>
>> in the above, 'ref1' uses 'val1' as the number of the elements in
>> 'p->array', and 'ref2' uses 'val2' as the number of elements in
>> 'p->array’.
>> "
>> Has this feature been used by Linux kernel already?
>> Is this feature really needed by Linux kernel?
>>
>> Thanks a lot for suggestions and comments.
>>
>> Qing
>>
>>> On Nov 20, 2024, at 14:23, Martin Uecker <[email protected]> wrote:
>>>
>>> Am Mittwoch, dem 20.11.2024 um 17:37 +0000 schrieb Qing Zhao:
>>>> Hi, Martin,
>>>>
>>>> Thanks a lot for pointing this out.
>>>>
>>>> This does look like a problem we need avoid for the pointer arrays.
>>>>
>>>> Does the same problem exist in the language extension too if the n is
>>>> allowed to be changed after initialization?
>>>>
>>>> If so, for the future language extension, is there any proposed solution
>>>> to this problem?
>>>>
>>>
>>> There is no specification yet and nothing formally proposed, so
>>> it is entirely unclear at this point.
>>>
>>> My idea would be to give 'x->buf' the type '(char(*)[x->n])'
>>> where 'x->n' is loaded at the same time 'x->buf' is accessed
>>> and then the value is frozen (like in the simpler versions
>>> of 'counted_by' you had implemented first). Of course, one
>>> would then have to set x->n before setting the buffer (or
>>> at the same time). This could be ensured by making the
>>> member 'n' const, so that it can only be changed by
>>> overwriting the whole struct. But I am still thinking
>>> about this.
>>>
>>> In any case, I think for "counted_by" this is not an option
>>> because it would be confusing if it works differently
>>> than for the FAM case.
>>>
>>> Martin
>>>
>>>
>>>> Qing
>>>>> On Nov 20, 2024, at 10:52, Martin Uecker <[email protected]> wrote:
>>>>>
>>>>> Am Mittwoch, dem 20.11.2024 um 15:27 +0000 schrieb Qing Zhao:
>>>>>>
>>>>>>> On Nov 19, 2024, at 10:47, Marek Polacek <[email protected]> wrote:
>>>>>>>
>>>>>>> On Mon, Nov 18, 2024 at 07:10:35PM +0100, Martin Uecker wrote:
>>>>>>>> Am Montag, dem 18.11.2024 um 17:55 +0000 schrieb Qing Zhao:
>>>>>>>>> Hi,
>>>>>>
>>>>> ..
>>>>>
>>>>> Hi Qing,
>>>>>
>>>>>> Per our discussion so far, if treating the following
>>>>>>
>>>>>> struct foo {
>>>>>> int n;
>>>>>> char *p __attribute__ ((counted_by (n)));
>>>>>> };
>>>>>>
>>>>>> as an array with upper-bounds being “n” is reasonable.
>>>>>
>>>>> There is one issue I want to point out, which I just realized during
>>>>> a discussion in WG14. For "counted_by" we defined the semantics
>>>>> in a way that later store to 'n' will be taken into account.
>>>>> We did this to support the use case where 'n' is set after
>>>>> the access to 'p'.
>>>>>
>>>>> struct foo *x = ;
>>>>>
>>>>> char *q = x->p;
>>>>> x->n = 100; // this should apply
>>>>>
>>>>>
>>>>> For FAMs this is fine, because it is a fixed part
>>>>> of the struct. But for the new pointer case, I think this is
>>>>> problematic. Consider this example:
>>>>>
>>>>> struct foo *x = allocate_buffer(100);
>>>>>
>>>>> where x->n is set to the right value in the allocation function.
>>>>>
>>>>> Now let's continue with
>>>>>
>>>>> char *q = x->p;
>>>>>
>>>>> x = allocate_buffer(50);
>>>>> // x->n is set to 50.
>>>>>
>>>>> Now q refers to the old buffer, but x->n to the size of the new
>>>>> buffer. That does not seem right and scares me a little bit.
>>>>>
>>>>>
>>>>>
>>>>> Martin
>>>>>
>>>>>
>>>>>>
>>>>>> Then, it’s reasonable to extend -fsanitize=bounds to instrument the
>>>>>> corresponding reference for the pointer with
>>>>>> Counted-by attribute.
>>>>>>
>>>>>> What do you think?
>>>>>>
>>>>>> Qing
>>>>>>
>>>>>>>
>>>>>>>> I think the question is what -fsanitize=bounds is meant to be.
>>>>>>>>
>>>>>>>> I am a bit frustrated about the sanitizer. On the
>>>>>>>> one hand, it is not doing enough to get spatial memory
>>>>>>>> safety even where this would be easily possible, on the
>>>>>>>> other hand, is pedantic about things which are technically
>>>>>>>> UB but not problematic and then one is prevented from
>>>>>>>> using it
>>>>>>>>
>>>>>>>> When used in default mode, where execution continues, it
>>>>>>>> also does not mix well with many warning, creates more code,
>>>>>>>> and pulls in a libary dependency (and the library also depends
>>>>>>>> on upstream choices / progress which seems a limitation for
>>>>>>>> extensions).
>>>>>>>>
>>>>>>>> What IMHO would be ideal is a protection mode for spatial
>>>>>>>> memory safety that simply adds traps (which then requires
>>>>>>>> no library, has no issues with other warnings, and could
>>>>>>>> evolve independently from clang)
>>>>>>>>
>>>>>>>> So shouldn't we just add a -fboundscheck (which would
>>>>>>>> be like -fsanitize=bounds -fsanitize-trap=bounds just with
>>>>>>>> more checking) and make it really good? I think many people
>>>>>>>> would be very happy about this.
>>>>>>>
>>>>>>> That's a separate concern. We already have the -fbounds-check option,
>>>>>>> currently only used in Fortran (and D?), so perhaps we could make
>>>>>>> that option a shorthand for -fsanitize=bounds -fsanitize-trap=bounds.
>>>>>>>
>>>>>>> Marek