> On Oct 23, 2023, at 3:37 PM, Martin Uecker <uec...@tugraz.at> wrote:
>
> On Monday, 23.10.2023 at 19:00 +0000, Qing Zhao wrote:
>>
>>> On Oct 23, 2023, at 2:31 PM, Martin Uecker <uec...@tugraz.at> wrote:
>>>
>>> On Monday, 23.10.2023 at 20:06 +0200, Martin Uecker wrote:
>>>> On Monday, 23.10.2023 at 16:37 +0000, Qing Zhao wrote:
>>>>>
>>>>>> On Oct 23, 2023, at 11:57 AM, Richard Biener
>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>
>>>>>>> On 23.10.2023 at 16:56, Qing Zhao <qing.z...@oracle.com> wrote:
>>>>>>>
>>>>>>>> On Oct 23, 2023, at 3:57 AM, Richard Biener
>>>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> On Fri, Oct 20, 2023 at 10:41 PM Qing Zhao <qing.z...@oracle.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> On Oct 20, 2023, at 3:10 PM, Siddhesh Poyarekar
>>>>>>>>>> <siddh...@gotplt.org> wrote:
>>>>>>>>>>
>>>>>>>>>> On 2023-10-20 14:38, Qing Zhao wrote:
>>>>>>>>>>> How about the following:
>>>>>>>>>>> Add one more parameter to __builtin_dynamic_object_size(), i.e.
>>>>>>>>>>> __builtin_dynamic_object_size (_1,1,array_annotated->foo)?
>>>>>>>>>>> When we see the structure field has counted_by attribute.
>>>>>>>>>>
>>>>>>>>>> Or maybe add a barrier preventing any assignments to
>>>>>>>>>> array_annotated->foo from being reordered below the __bdos call?
>>>>>>>>>> Basically an __asm__ with array_annotated->foo in the clobber list
>>>>>>>>>> ought to do it I think.
>>>>>>>>>
>>>>>>>>> Maybe just adding the array_annotated->foo to the use list of the
>>>>>>>>> call to __builtin_dynamic_object_size should be enough?
>>>>>>>>>
>>>>>>>>> But I am not sure how to implement this in the TREE level, is there a
>>>>>>>>> USE_LIST/CLOBBER_LIST for each call? Then I can just simply add the
>>>>>>>>> counted_by field “array_annotated->foo” to the USE_LIST of the call
>>>>>>>>> to __bdos?
>>>>>>>>>
>>>>>>>>> This might be the simplest solution?
>>>>>>>>
>>>>>>>> If the dynamic object size is derived from a field then I think you
>>>>>>>> need to put the "load" of that memory location at the point (as
>>>>>>>> argument) of the __bos call right at parsing time. I know that's
>>>>>>>> awkward because you try to play tricks "discovering" that field only
>>>>>>>> late, but that's not going to work.
>>>>>>>
>>>>>>> Is it better to do this at gimplification phase instead of FE?
>>>>>>>
>>>>>>> VLA decls are handled in gimplification phase, the size calculation and
>>>>>>> call to alloca are all generated during this phase (gimplify_vla_decl).
>>>>>>>
>>>>>>> For __bdos calls, we can add an additional argument if the type of the
>>>>>>> first argument includes the counted_by attribute, i.e.
>>>>>>>
>>>>>>> **During gimplification,
>>>>>>> For a call to __builtin_dynamic_object_size (ptr, type)
>>>>>>> Check whether the type of ptr includes counted_by attribute, if so,
>>>>>>> change the call to
>>>>>>> __builtin_dynamic_object_size (ptr, type, counted_by field)
>>>>>>>
>>>>>>> Then the correct data dependence should be represented well in the IR.
>>>>>>>
>>>>>>> **During object size phase,
>>>>>>>
>>>>>>> The call to __builtin_dynamic_object_size will become an expression
>>>>>>> that includes the counted_by field, or -1/0 when we cannot decide the
>>>>>>> size; the correct data dependence will be kept even after the call to
>>>>>>> __builtin_dynamic_object_size is gone.
>>>>>>
>>>>>> But the whole point of the BOS pass is to derive information that is not
>>>>>> available at parsing time, and that’s the cases you are after. The case
>>>>>> where the connection to the field with the length is apparent during
>>>>>> parsing is easy - you simply insert a load of the value before the BOS
>>>>>> call.
>>>>>
>>>>> Yes, this is true.
>>>>> I prefer to implement this in gimplification phase since I am more
>>>>> familiar with the code there.
>>>>> (I think that implementing it in
>>>>> gimplification should be very similar as implementing it in FE? Or do I
>>>>> miss anything here?)
>>>>>
>>>>> Joseph, if implement this in FE, where in the FE I should look at?
>>>>>
>>>>
>>>> We should aim for a good integration with the BDOS pass, so
>>>> that it can propagate the information further, e.g. the
>>>> following should work:
>>>>
>>>> struct { int L; char buf[] __counted_by(L) } x;
>>>> x.L = N;
>>>> x.buf = ...;
>>>> char *p = &x->f;
>>>> __bdos(p) -> N
>>>>
>>>> So we need to be smart on how we provide the size
>>>> information for x->f to the backend.
>>>
>>> To follow up on this. I do not think we should change the
>>> builtin in the FE or gimplification. Instead, we want
>>> to change the field access and compute the size there.
>>
>> Could you please clarify on this? What do you mean by
>> "change the field access and compute the size there"?
>
> I think the FE should essentially give the
> type
>
> char [buf.L]
>
> to buf.x;
>
> If the type (or its size) could be preserved
> at this point so that it can be later
> discovered by __bdos, then it could know
> the size and propagate it further.

Currently, we already store the size info x.L of x.buf in the attribute
list of the field_decl of “x.buf”, and __bdos can readily use it without
any issue.

Putting “x.L” into the TYPE of x.buf is the other approach, which makes it
a language extension.

So, do you mean implementing the attribute in a way similar to the language
extension? I.e., convert the “attribute” info into the TYPE system in the
FE, so that the middle end only uses the TYPE info and no longer the
attribute?

>
> For the attribute, this is not exactly what
> the FE could do because the semantic type
> can not change, but this is roughly the idea.

So, the attribute still cannot be put into the regular TYPE system in the
FE, and we need to come up with a new field in the current TYPE system to
carry such info?

Then what is the benefit of this new field in the TYPE system over my
current approach (the attribute list of the field_decl)? Can this new
approach resolve the reordering issue?

>
>>>
>>> In my toy patch I then made this have a VLA type that
>>> encodes the size. Here, this would need to be done
>>> differently.
>>>
>>> But still, what we are missing in both cases
>>> is a proper way to pass the information down to BDOS.
>>
>> What’s the issue with adding a new argument (x.L) to the BDOS call? What’s
>> missing with this approach?
>>
>
> See the example above. The BDOS call might come much
> later when the relationship of the pointer to the
> field access is no longer there.

Why is the relationship of the pointer to the field access no longer there
at the __bdos call in the above example?

My understanding is that the relationship is still there: it is recorded in
the attribute list of the field_decl of the structure TYPE, and the BDOS
call can access that information without any issue.

I tried to come up with a small test case based on your example above, but
it failed with a compilation error:

#include <stdint.h>
#include <malloc.h>

struct annotated {
  size_t L;
  char buf[] __attribute__((counted_by (L)));
};

int main ()
{
  struct annotated x;
  x.L = 10;
  x.buf = (char *) malloc (x.L * sizeof (char));
  char *p = &(x.buf);
  size_t size = __builtin_dynamic_object_size (p, 1);
  printf("the size of q is %lu \n", size);
  return 0;
}

/home/opc/Install/latest-d/bin/gcc -O3 t4.c
t4.c: In function ‘main’:
t4.c:13:9: error: invalid use of flexible array member
   13 |   x.buf = (char *) malloc (x.L * sizeof (char));
      |         ^
t4.c:14:13: warning: initialization of ‘char *’ from incompatible pointer type ‘char (*)[]’ [-Wincompatible-pointer-types]
   14 |   char *p = &(x.buf);
      |             ^

Could you please provide a working test case for this?

On the other hand, the following small test case works without any issue
with my GCC:

#include <stdint.h>
#include <stdio.h>
#include <malloc.h>

struct annotated {
  size_t foo;
  char array[] __attribute__((counted_by (foo)));
};

#define noinline __attribute__((__noinline__))

static struct annotated * noinline alloc_buf (int index)
{
  struct annotated *p;
  p = malloc(sizeof (*p) + (index) * sizeof (char));
  return p;
}

int main ()
{
  size_t size = 0;
  struct annotated *p = alloc_buf (10);
  p->foo = 10;
  char *q = p->array;
  size = __builtin_dynamic_object_size (q, 1);
  printf("the size of q is %lu \n", size);
  return 0;
}

[opc@qinzhao-ol8u3-x86 Sid]$ sh t
/home/opc/Install/latest-d/bin/gcc -O3 t3.c
the size of q is 10

>
>>>
>>> For VLAs this works because BDOS can see the size of
>>> the definition. For calls to allocation functions
>>> it is read from an attribute.
>>
>> You mean for VLA, BDOS see the size of the definition
>> from the attribute for the allocation function?
>> Yes, that’s the case for VLA.
>
> Ok, I am wrong about how it works for VLAs. They
> get transformed to an alloca.
>
> But all calls marked with alloc_size and other
> allocations functions are detected in BDOS.

Yes.

Qing
>
>>
>> For VLA, the size computation and storage allocation are all done by the
>> compiler (through “gimplify_vla_decl” in gimplification phase),
>> so these two can be tied together by the compiler.
>>
>> However, for FAM with counted_by attribute, the
>> storage allocation and the counted_by assignment
>> are done by the user.
>
> Yes.
>
> Martin
>
>>
>> Qing
>>>
>>> But I am not sure what would be the best way to encode
>>> this information so that BDOS can later access it.
>>>
>>> Martin
>>>
>>>>
>>>> This would also be desirable for the language extension.
>>>>
>>>> Martin
>>>>
>>>>> Thanks a lot for the help.
>>>>>
>>>>> Qing
>>>>>
>>>>>> For the late case there’s no way to invent data flow dependence without
>>>>>> inadvertently pessimizing optimization.
>>>>>>
>>>>>> Richard
>>>>>>
>>>>>>>>
>>>>>>>> A related issue is that assignment to the field and storage allocation
>>>>>>>> are not tied together
>>>>>>>
>>>>>>> Yes, this is different from VLA, in which the size assignment and the
>>>>>>> storage allocation are generated and tied together by the compiler.
>>>>>>>
>>>>>>> For the flexible array member, the storage allocation and the size
>>>>>>> assignment are all done by the user. So, we need to clarify this
>>>>>>> requirement in the documentation to guide users to write correct code.
>>>>>>> Also, we might need to provide tools (warnings and sanitizer option) to
>>>>>>> help users catch such coding errors.
>>>>>>>
>>>>>>>> - if there's no use of the size data we might
>>>>>>>> remove the store of it as dead.
>>>>>>>
>>>>>>> Yes, when __bdos cannot decide the size, we need to remove the dead
>>>>>>> store to the field.
>>>>>>> I guess that the compiler should be able to do this automatically?
>>>>>>>
>>>>>>> thanks.
>>>>>>>
>>>>>>> Qing
>>>>>>>>
>>>>>>>> Of course I guess __bos then behaves like sizeof ().
>>>>>>>>
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Qing
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> It may not work for something like this though:
>>>>>>>>>>
>>>>>>>>>> static size_t
>>>>>>>>>> get_size_of (void *ptr)
>>>>>>>>>> {
>>>>>>>>>>   return __bdos (ptr, 1);
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> void
>>>>>>>>>> foo (size_t sz)
>>>>>>>>>> {
>>>>>>>>>>   array_annotated = __builtin_malloc (sz);
>>>>>>>>>>   array_annotated = sz;
>>>>>>>>>>
>>>>>>>>>>   ...
>>>>>>>>>>   __builtin_printf ("%zu\n", get_size_of (array_annotated->foo));
>>>>>>>>>>   ...
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> because the call to get_size_of () may not have been inlined that
>>>>>>>>>> early.
>>>>>>>>>>
>>>>>>>>>> The more fool-proof alternative may be to put a compile time barrier
>>>>>>>>>> right below the assignment to array_annotated->foo; I reckon you
>>>>>>>>>> could do that early in the front end by marking the size identifier
>>>>>>>>>> and then tracking assignments to that identifier. That may have a
>>>>>>>>>> slight runtime performance overhead since it may prevent even
>>>>>>>>>> legitimate reordering. I can't think of another alternative at the
>>>>>>>>>> moment...
>>>>>>>>>>
>>>>>>>>>> Sid
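
For reference, the compile-time barrier idea from Sid's quoted message could look roughly like the sketch below when written by hand in user code. This is only an illustration with made-up names (COMPILER_BARRIER, alloc_annotated); it is not what the front end would emit, and it leaves out the counted_by attribute so that it also builds with compilers that do not support it. The empty asm with a "memory" clobber is a common GCC idiom: it emits no instructions, but the compiler may not move memory accesses across it.

```c
#include <stddef.h>
#include <stdlib.h>

struct annotated {
  size_t foo;      /* element count for array[] */
  char array[];    /* flexible array member */
};

/* Compiler-only barrier: no instruction is emitted, but the compiler may
   not reorder or cache memory accesses across this point.  */
#define COMPILER_BARRIER() __asm__ __volatile__ ("" : : : "memory")

/* Hypothetical helper: allocate room for n chars and record the count,
   with a barrier placed right after the store to the count field.  */
static struct annotated *
alloc_annotated (size_t n)
{
  struct annotated *p = malloc (sizeof (*p) + n * sizeof (char));
  if (p == NULL)
    return NULL;
  p->foo = n;
  COMPILER_BARRIER ();
  return p;
}
```

As noted in the quoted text, such a barrier may also block legitimate reordering, so it trades some optimization freedom for a guaranteed ordering of the store.
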