> On Apr 11, 2025, at 14:24, Martin Uecker <uec...@tugraz.at> wrote:
>
> On Friday, 11.04.2025 at 18:14 +0000, Qing Zhao wrote:
>>
>>> On Apr 11, 2025, at 13:37, Martin Uecker <uec...@tugraz.at> wrote:
>>>
>>> On Friday, 11.04.2025 at 17:08 +0000, Qing Zhao wrote:
>>>>
>>>>> On Apr 11, 2025, at 12:20, Martin Uecker <uec...@tugraz.at> wrote:
>>>>>
>>>>> On Friday, 11.04.2025 at 16:01 +0000, Qing Zhao wrote:
>>>>>>
>>>>>>> On Apr 11, 2025, at 10:53, Martin Uecker <uec...@tugraz.at> wrote:
>>>>>>>
>>>>>>> On Friday, 11.04.2025 at 10:42 -0400, Andrew MacLeod wrote:
>>>>>>>> On 4/11/25 10:27, Qing Zhao wrote:
>>>>>>>>>
>>>>>>>>>> On Apr 10, 2025, at 11:12, Martin Uecker <uec...@tugraz.at> wrote:
>>>>>>>>>>
>>>>>>>>>> On Thursday, 10.04.2025 at 10:55 -0400, Siddhesh Poyarekar wrote:
>>>>>>>>>>> On 2025-04-10 10:50, Andrew MacLeod wrote:
>>>>>>>>>>>> It's not clear to me exactly what is being asked, but I think the
>>>>>>>>>>>> suggestion is that pointer references are being replaced with a
>>>>>>>>>>>> builtin function called .ACCESS_WITH_SIZE? And I presume that
>>>>>>>>>>>> builtin function has some parameters that give you relevant range
>>>>>>>>>>>> information of some sort?
>>>>>>>>>>> Added, not replaced, but yes, that's essentially it.
>>>>>>>>>>>
>>>>>>>>>>>> range-ops is set up to pull range information from builtin
>>>>>>>>>>>> functions already, in
>>>>>>>>>>>> gimple-range-op.cc::gimple_range_op_handler::maybe_builtin_call ().
>>>>>>>>>>>> We'd just need to write a handler for this new one. You can pull
>>>>>>>>>>>> information from 2 operands under normal circumstances, but
>>>>>>>>>>>> exceptions can be made. I'd need a description of what it looks
>>>>>>>>>>>> like and how that translates to range info.
>>>>>>>>>>> That's perfect!
>>>>>>>>>>> It's probably redundant for cases where we end up with
>>>>>>>>>>> both .ACCESS_WITH_SIZE and a __bos/__bdos call, but I don't
>>>>>>>>>>> remember if that's the only place where .ACCESS_WITH_SIZE is
>>>>>>>>>>> generated today. Qing, could you please work with Andrew on this?
>>>>>>>>>> BTW, what I would find very interesting is inserting such information
>>>>>>>>>> at the points where arrays decay to pointers.
>>>>>>>>> Is the following the kind of example you mean?
>>>>>>>>>
>>>>>>>>>  1 #include <stdio.h>
>>>>>>>>>  2
>>>>>>>>>  3 void foo (int arr[]) {
>>>>>>>>>  4   // Inside the function, arr is treated as a pointer
>>>>>>>>>  5   arr[6] = 10;
>>>>>>>>>  6 }
>>>>>>>>>  7
>>>>>>>>>  8 int my_array[5] = {10, 20, 30, 40, 50};
>>>>>>>>>  9
>>>>>>>>> 10 int main() {
>>>>>>>>> 11   my_array[6] = 6;
>>>>>>>>> 12   int *ptr = my_array; // Array decays to pointer here
>>>>>>>>> 13   ptr[7] = 7;
>>>>>>>>> 14   foo (my_array);
>>>>>>>>> 15
>>>>>>>>> 16   return 0;
>>>>>>>>> 17 }
>>>>>>>>>
>>>>>>>>> When I use the latest gcc to compile the above with -Warray-bounds:
>>>>>>>>>
>>>>>>>>> []$ gcc -O2 -Warray-bounds t.c
>>>>>>>>> t.c: In function ‘main’:
>>>>>>>>> t.c:13:6: warning: array subscript 7 is outside array bounds of
>>>>>>>>> ‘int[5]’ [-Warray-bounds=]
>>>>>>>>>    13 |   ptr[7] = 7;
>>>>>>>>>       |   ~~~^~~
>>>>>>>>> t.c:8:5: note: at offset 28 into object ‘my_array’ of size 20
>>>>>>>>>     8 | int my_array[5] = {10, 20, 30, 40, 50};
>>>>>>>>>       |     ^~~~~~~~
>>>>>>>>> In function ‘foo’,
>>>>>>>>>     inlined from ‘main’ at t.c:14:3:
>>>>>>>>> t.c:5:10: warning: array subscript 6 is outside array bounds of
>>>>>>>>> ‘int[5]’ [-Warray-bounds=]
>>>>>>>>>     5 |   arr[6] = 10;
>>>>>>>>>       |   ~~~~~~~^~~~
>>>>>>>>> t.c: In function ‘main’:
>>>>>>>>> t.c:8:5: note: at offset 24 into object ‘my_array’ of size 20
>>>>>>>>>     8 | int my_array[5] = {10, 20, 30, 40, 50};
>>>>>>>>>       |     ^~~~~~~~
>>>>>>>>>
>>>>>>>>> It looks like, even after the array decays to a pointer, the bound
>>>>>>>>> information is still carried
>>>>>>>>> for the decayed pointer somehow (I guess that VRP did this?)
>>>>>>>>
>>>>>>>> No, the behaviour in these warnings comes from something else.
>>>>>>>> Although some range info from VRP is used, most of this is tracked by
>>>>>>>> the pointer_query (pointer-query.cc) mechanism, which was written a
>>>>>>>> number of years ago, before ranger was completed. It attempts to do
>>>>>>>> its own custom tracking of pointers, what they point to, and the size
>>>>>>>> of the things they access.
>>>>>>>>
>>>>>>>> There are issues with that code, and the goal is to replace it with
>>>>>>>> ranger's prange. Alas, enhancement work on prange is needed for that
>>>>>>>> to happen, as it doesn't currently track any points-to info. That
>>>>>>>> would then be followed by converting the warning code to use
>>>>>>>> ranger/VRP instead.
>>>>>>>>
>>>>>>>> Any adjustments to ranger for this are unlikely to affect anything
>>>>>>>> until that work is done, and I do not think anyone is equipped to
>>>>>>>> attempt to update the existing pointer-query code.
>>>>>>>>
>>>>>>>> Unfortunately :-(
>>>>>>>
>>>>>>> The examples I have in mind for .ACCESS_WITH_SIZE are the
>>>>>>> following two:
>>>>>>>
>>>>>>> struct foo {
>>>>>>>
>>>>>>>   char arr[3];
>>>>>>>   int b;
>>>>>>> };
>>>>>>>
>>>>>>> void f(struct foo x)
>>>>>>> {
>>>>>>>   char *ptr = x.arr;
>>>>>>>   ptr[4] = 10;
>>>>>>> }
>>>>>>
>>>>>> The above is an example of an array field of a structure decaying to a
>>>>>> pointer.
>>>>>>
>>>>>> Yes, tracking and keeping the bound info for a field is usually harder
>>>>>> than for a regular variable. However, I think it's still possible to
>>>>>> improve the compiler analysis to do this, since the original bound
>>>>>> info is in the code.
>>>>>>
>>>>>>>
>>>>>>> void g(char (*arr)[4])
>>>>>>> {
>>>>>>>   char *ptr = *arr;
>>>>>>>   ptr[4] = 1;
>>>>>>> }
>>>>>>>
>>>>>>
>>>>>> The above example is about decaying a formal parameter array to a
>>>>>> pointer.
>>>>>> I think that, since the bound information is in the code here too, it
>>>>>> should be possible to improve the current compiler analysis to catch
>>>>>> such cases.
>>>>>>
>>>>>> For the above two cases, the current compiler analysis is not able to
>>>>>> propagate the bound information. But since the bound info is already
>>>>>> in the code, it's possible to improve the current compiler analysis to
>>>>>> propagate such information more aggressively.
>>>>>>
>>>>>> Inserting a call to .ACCESS_WITH_SIZE for such cases is not necessary.
>>>>>>
>>>>>> .ACCESS_WITH_SIZE is necessary for the cases where the size
>>>>>> information is not available in the source code, i.e., the cases where
>>>>>> we need to add an attribute to specify the size of the access, for
>>>>>> example, the counted_by attribute, the access attribute, etc.
>>>>>
>>>>> When you add an attribute, the information is also in the source.
>>>>>
>>>>> .ACCESS_WITH_SIZE can be used to make semantic information
>>>>> from a type or attribute available to other passes, so it
>>>>> seems the right tool to use here.
>>>>>
>>>>
>>>> If I remember correctly, the main purpose of adding .ACCESS_WITH_SIZE is
>>>> to explicitly add the reference to the size field into the data flow, in
>>>> order to prevent any incorrect code reordering from happening.
>>>>
>>>> Even without .ACCESS_WITH_SIZE, compiler phases that need the size
>>>> information can get it from the IR.
>>>>
>>>> My understanding is that the issue of the missing implicit data-flow
>>>> dependency exists only for the counted_by attribute, not for the other
>>>> types, which already have the bound information there.
>>>>
>>>
>>> The dependency issue is only for the size, but for
>>> other types the size information is often not
>>> preserved, and so is not available later.
>>> .ACCESS_WITH_SIZE would solve this.
>>
>> Yes, I see your point here:
>>
>> The original size information from the array would be passed through
>> the 2nd parameter of .ACCESS_WITH_SIZE to the decayed pointer. :-)
>>
>> However, I am not sure how much more benefit this would bring compared
>> to improving the compiler analysis to better track the bound
>> information for pointers through data flow.
>
> My understanding is that it would be very difficult and require a lot
> of changes if one wanted to preserve the size expressions in
> array types so that later passes can still access them. The
> easiest solution would be to add an internal function at the
> points where arrays decay to pointers, which explicitly stores the
> size for later use. Since .ACCESS_WITH_SIZE does exactly this for other
> purposes, it would seem natural to reuse it. But it is quite
> possible that I am missing something important.
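
To make the idea concrete, here is a rough user-level analogy in C of
"record the size at the point of decay and carry it with the pointer".
This is only a sketch and not how GCC represents it: .ACCESS_WITH_SIZE
is an internal function in the IR, while struct sized_ptr,
DECAY_WITH_SIZE and checked_store below are hypothetical names made up
for illustration. The functions f and g mirror the two examples quoted
earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-level analogy only.  This is NOT GCC's
   .ACCESS_WITH_SIZE, which lives in the compiler IR.  */
struct sized_ptr {
  char *ptr;    /* the decayed pointer */
  size_t size;  /* size in bytes of the array it decayed from */
};

/* Record the array size at the point of decay.  The argument must be
   a real array (not a pointer), so that sizeof yields the array size.  */
#define DECAY_WITH_SIZE(arr) \
  ((struct sized_ptr) { (arr), sizeof (arr) })

/* Store VALUE at byte offset IDX only if IDX lies within the recorded
   size.  Returns true if the store was performed.  */
bool
checked_store (struct sized_ptr p, size_t idx, char value)
{
  assert (p.ptr != NULL);
  if (idx >= p.size)
    return false;
  p.ptr[idx] = value;
  return true;
}

struct foo {
  char arr[3];
  int b;
};

/* Mirrors f() above: index 4 is past the end of arr[3].  */
bool
f (struct foo x)
{
  struct sized_ptr ptr = DECAY_WITH_SIZE (x.arr);
  return checked_store (ptr, 4, 10);
}

/* Mirrors g() above: *arr has type char[4], so index 4 is one past
   the end.  */
bool
g (char (*arr)[4])
{
  struct sized_ptr ptr = DECAY_WITH_SIZE (*arr);
  return checked_store (ptr, 4, 1);
}
```

With this sketch, both f and g refuse the out-of-bounds store because
the size recorded at the decay point (3 and 4 bytes respectively)
travels with the pointer. Whether the same effect is better achieved in
the compiler by an explicit internal function or by improved data-flow
analysis is exactly the open question.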

I will study this in more detail to see whether there are any potential
issues.

Thanks a lot for your comments and suggestions.

Qing

>
> Martin
>