> On Nov 30, 2024, at 07:22, Martin Uecker <uec...@tugraz.at> wrote: > > Am Dienstag, dem 26.11.2024 um 20:59 +0000 schrieb Qing Zhao: >> Think this over these days, I have another thought that need some feedback: >> >> The major issue right now is: >> >> 1. For the following structure in which the “counted_by” attributes is >> attached to the pointer field. >> >> struct foo { >> int n; >> char *p __attribute__ ((counted_by (n))); >> } *x; > > BTW: I also do not like the syntax 'n' for a lookup of a member > in the member namespace. I think it should be '.n'. For FAM this > is less problematic because it always at the end, but here it is > problematic. During my implementation of extending this attribute to pointer fields of structure. I didn’t notice issue with the current syntax ’n’ for the pointer fields so far, even though when the field “n” is declared after the corresponding pointer field, i.e:
struct foo { { char *p __attribute__ ((counted_by (n))); int n; } So, could you please explain a little bit more on what’s the potential issue here? >> >> There is one important additional requirement: >> >> x->n, x->p can ONLY be changed by changing the whole structure at the same >> time. >> Otherwise, x->n might not be consistent with x->p. > > By itself, this would still not fix the issue I pointed out. > > struct foo x; > x = .. ; // set the whole structure > char *p = x->p; > x = ... ; // set the whole structure > > What is the bound for 'p' ? Since p was set to the pointer field of the old structure, then the bound of it should be the old bound. > With current rules it would be the old bound. I thought that this should be the correct behavior, isn’t it? > > >> >> 2. This new requirement is ONLY for “counted_by” attribute that is attached >> to the pointer field, not needed >> for flexible array members. >> >> 3. Then there will be inconsistency for the “counted_by” attribute between >> FAM and pointer field. > > >> The major questions I have right now: >> >> 1. Shall we keep this inconsistency between FAM and pointer field? >> Or, >> >> 2. Shall we keep them consistent by adding this new requirement for the >> previous FAM? > > Or have a new attribute? I feel we double down on a bad design > which made sense for FAM addressing a very specific use case > (retrofitting checks to the Linux kernel) but is otherwise not > very strong. I do agree that the previous specific feature of the “counted_by" for the FAM, i.e: " One important feature of the attribute is, a reference to the flexible array member field uses the latest value assigned to the field that represents the number of the elements before that reference. For example, p->count = val1; p->array[20] = 0; // ref1 to p->array p->count = val2; p->array[30] = 0; // ref2 to p->array in the above, 'ref1' uses 'val1' as the number of the elements in 'p->array', and 'ref2' uses 'val2' as the number of elements in 'p->array’. “ Is a specific feature for Linux kernel, I am wondering whether the above feature really needed by the Linux kernel? If Not, I prefer to eliminate this specific feature from GCC before we officially release the “counted_by” attribute in GCC15. If Linux kernel does need this feature, Yes, maybe a new attribute for pointer is better. Kees, could you please also provide more comments and suggestion on this? > > Martin > >> >> Kees, the following feature that you requested for the FAM: >> >> " >> One important feature of the attribute is, a reference to the >> flexible array member field uses the latest value assigned to the >> field that represents the number of the elements before that >> reference. For example, >> >> p->count = val1; >> p->array[20] = 0; // ref1 to p->array >> p->count = val2; >> p->array[30] = 0; // ref2 to p->array >> >> in the above, 'ref1' uses 'val1' as the number of the elements in >> 'p->array', and 'ref2' uses 'val2' as the number of elements in >> 'p->array’. >> " >> Has this feature been used by Linux kernel already? >> Is this feature really needed by Linux kernel? >> >> Thanks a lot for suggestions and comments. >> >> Qing >> >>> On Nov 20, 2024, at 14:23, Martin Uecker <uec...@tugraz.at> wrote: >>> >>> Am Mittwoch, dem 20.11.2024 um 17:37 +0000 schrieb Qing Zhao: >>>> Hi, Martin, >>>> >>>> Thanks a lot for pointing this out. >>>> >>>> This does look like a problem we need avoid for the pointer arrays. >>>> >>>> Does the same problem exist in the language extension too if the n is >>>> allowed to be changed after initialization? >>>> >>>> If so, for the future language extension, is there any proposed solution >>>> to this problem? >>>> >>> >>> There is no specification yet and nothing formally proposed, so >>> it is entirely unclear at this point. >>> >>> My idea would be to give 'x->buf' the type '(char(*)[x->n])' >>> where 'x->n' is loaded at the same time 'x->buf' is accessed >>> and then the value is frozen (like in the simpler versions >>> of 'counted_by' you had implemented first). Of course, one >>> would then have to set x->n before setting the buffer (or >>> at the same time). This could be ensured by making the >>> member 'n' const, so that it can only be changed by >>> overwriting the whole struct. But I am still thinking >>> about this. >>> >>> In any case, I think for "counted_by" this is not an option >>> because it would be confusing if it works differently >>> than for the FAM case. >>> >>> Martin >>> >>> >>>> Qing >>>>> On Nov 20, 2024, at 10:52, Martin Uecker <uec...@tugraz.at> wrote: >>>>> >>>>> Am Mittwoch, dem 20.11.2024 um 15:27 +0000 schrieb Qing Zhao: >>>>>> >>>>>>> On Nov 19, 2024, at 10:47, Marek Polacek <pola...@redhat.com> wrote: >>>>>>> >>>>>>> On Mon, Nov 18, 2024 at 07:10:35PM +0100, Martin Uecker wrote: >>>>>>>> Am Montag, dem 18.11.2024 um 17:55 +0000 schrieb Qing Zhao: >>>>>>>>> Hi, >>>>>> >>>>> .. >>>>> >>>>> Hi Qing, >>>>> >>>>>> Per our discussion so far, if treating the following >>>>>> >>>>>> struct foo { >>>>>> int n; >>>>>> char *p __attribute__ ((counted_by (n))); >>>>>> }; >>>>>> >>>>>> as an array with upper-bounds being “n” is reasonable. >>>>> >>>>> There is one issue I want to point out, which I just realized during >>>>> a discussion in WG14. For "counted_by" we defined the semantics >>>>> in a way that later store to 'n' will be taken into account. >>>>> We did this to support the use case where 'n' is set after >>>>> the access to 'p'. >>>>> >>>>> struct foo *x = ; >>>>> >>>>> char *q = x->p; >>>>> x->n = 100; // this should apply >>>>> >>>>> >>>>> For FAMs this is fine, because it is a fixed part >>>>> of the struct. But for the new pointer case, I think this is >>>>> problematic. Consider this example: >>>>> >>>>> struct foo *x = allocate_buffer(100); >>>>> >>>>> where x->n is set to the right value in the allocation function. >>>>> >>>>> Now let's continue with >>>>> >>>>> char *q = x->p; >>>>> >>>>> x = allocate_buffer(50); >>>>> // x->n is set to 50. >>>>> >>>>> Now q refers to the old buffer, but x->n to the size of the new >>>>> buffer. That does not seem right and scares me a little bit. >>>>> >>>>> >>>>> >>>>> Martin >>>>> >>>>> >>>>>> >>>>>> Then, it’s reasonable to extend -fsanitize=bounds to instrument the >>>>>> corresponding reference for the pointer with >>>>>> Counted-by attribute. >>>>>> >>>>>> What do you think? >>>>>> >>>>>> Qing >>>>>> >>>>>>> >>>>>>>> I think the question is what -fsanitize=bounds is meant to be. >>>>>>>> >>>>>>>> I am a bit frustrated about the sanitizer. On the >>>>>>>> one hand, it is not doing enough to get spatial memory >>>>>>>> safety even where this would be easily possible, on the >>>>>>>> other hand, is pedantic about things which are technically >>>>>>>> UB but not problematic and then one is prevented from >>>>>>>> using it >>>>>>>> >>>>>>>> When used in default mode, where execution continues, it >>>>>>>> also does not mix well with many warning, creates more code, >>>>>>>> and pulls in a libary dependency (and the library also depends >>>>>>>> on upstream choices / progress which seems a limitation for >>>>>>>> extensions). >>>>>>>> >>>>>>>> What IMHO would be ideal is a protection mode for spatial >>>>>>>> memory safety that simply adds traps (which then requires >>>>>>>> no library, has no issues with other warnings, and could >>>>>>>> evolve independently from clang) >>>>>>>> >>>>>>>> So shouldn't we just add a -fboundscheck (which would >>>>>>>> be like -fsanitize=bounds -fsanitize-trap=bounds just with >>>>>>>> more checking) and make it really good? I think many people >>>>>>>> would be very happy about this. >>>>>>> >>>>>>> That's a separate concern. We already have the -fbounds-check option, >>>>>>> currently only used in Fortran (and D?), so perhaps we could make >>>>>>> that option a shorthand for -fsanitize=bounds -fsanitize-trap=bounds. >>>>>>> >>>>>>> Marek