Re: .ACCESS_WITH_SIZE and pointer ranges

Qing Zhao Fri, 11 Apr 2025 10:08:43 -0700


> On Apr 11, 2025, at 12:20, Martin Uecker <[email protected]> wrote:
> 
> Am Freitag, dem 11.04.2025 um 16:01 +0000 schrieb Qing Zhao:
>> 
>>> On Apr 11, 2025, at 10:53, Martin Uecker <[email protected]> wrote:
>>> 
>>> Am Freitag, dem 11.04.2025 um 10:42 -0400 schrieb Andrew MacLeod:
>>>> On 4/11/25 10:27, Qing Zhao wrote:
>>>>> 
>>>>>> On Apr 10, 2025, at 11:12, Martin Uecker <[email protected]> wrote:
>>>>>> 
>>>>>> Am Donnerstag, dem 10.04.2025 um 10:55 -0400 schrieb Siddhesh Poyarekar:
>>>>>>> On 2025-04-10 10:50, Andrew MacLeod wrote:
>>>>>>>> It's not clear to me exactly what is being asked, but I think the
>>>>>>>> suggestion is that pointer references are being replaced with a builtin
>>>>>>>> function called .ACCESS_WITH_SIZE?  And I presume that builtin
>>>>>>>> function has some parameters that give you relevant range information
>>>>>>>> of some sort?
>>>>>>> Added, not replaced, but yes, that's essentially it.
>>>>>>> 
>>>>>>>> range-ops is set up to pull range information from builtin functions
>>>>>>>> already in gimple-range-op.cc::
>>>>>>>> gimple_range_op_handler::maybe_builtin_call ().  We'd just need to
>>>>>>>> write a handler for this new one.  You can pull information from 2
>>>>>>>> operands under normal circumstances, but exceptions can be made.  I'd
>>>>>>>> need a description of what it looks like and how that translates to
>>>>>>>> range info.
>>>>>>> That's perfect!  It's probably redundant for cases where we end up with
>>>>>>> both .ACCESS_WITH_SIZE and a __bos/__bdos call, but I don't remember if
>>>>>>> that's the only place where .ACCESS_WITH_SIZE is generated today.  Qing,
>>>>>>> could you please work with Andrew on this?
>>>>>> BTW, what I would find very interesting is inserting such information
>>>>>> at the points where arrays decay to pointer.
>>>>> Is the following the example?
>>>>> 
>>>>>  1 #include <stdio.h>
>>>>>  2
>>>>>  3 void foo (int arr[]) {
>>>>>  4   // Inside the function, arr is treated as a pointer
>>>>>  5   arr[6] = 10;
>>>>>  6 }
>>>>>  7
>>>>>  8 int my_array[5] = {10, 20, 30, 40, 50};
>>>>>  9
>>>>> 10 int main() {
>>>>> 11   my_array[6] = 6;
>>>>> 12   int *ptr = my_array; // Array decays to pointer here
>>>>> 13   ptr[7] = 7;
>>>>> 14   foo (my_array);
>>>>> 15
>>>>> 16   return 0;
>>>>> 17 }
>>>>> 
>>>>> When I use the latest gcc to compile the above with -Warray-bounds:
>>>>> 
>>>>> []$ gcc -O2 -Warray-bounds t.c
>>>>> t.c: In function ‘main’:
>>>>> t.c:13:6: warning: array subscript 7 is outside array bounds of ‘int[5]’ [-Warray-bounds=]
>>>>>   13 |   ptr[7] = 7;
>>>>>      |   ~~~^~~
>>>>> t.c:8:5: note: at offset 28 into object ‘my_array’ of size 20
>>>>>    8 | int my_array[5] = {10, 20, 30, 40, 50};
>>>>>      |     ^~~~~~~~
>>>>> In function ‘foo’,
>>>>>    inlined from ‘main’ at t.c:14:3:
>>>>> t.c:5:10: warning: array subscript 6 is outside array bounds of ‘int[5]’ [-Warray-bounds=]
>>>>>    5 |   arr[6] = 10;
>>>>>      |   ~~~~~~~^~~~
>>>>> t.c: In function ‘main’:
>>>>> t.c:8:5: note: at offset 24 into object ‘my_array’ of size 20
>>>>>    8 | int my_array[5] = {10, 20, 30, 40, 50};
>>>>>      |     ^~~~~~~~
>>>>> 
>>>>> It looks like, even after the array decays to a pointer, the bound
>>>>> information is still carried for the decayed pointer somehow (I guess
>>>>> that VRP did this?)
>>>> 
>>>> No, the behaviour in these warnings is from something else. Although 
>>>> some range info from VRP is used, most of this is tracked by the 
>>>> pointer_query (pointer-query.cc) mechanism that was written a number of 
>>>> years ago before ranger was completed.  It attempts to do its own custom 
>>>> tracking of pointers and what they point to and the size of things they 
>>>> access.
>>>> 
>>>> There are issues with that code, and the goal is to replace it with
>>>> ranger's prange.  Alas, there is enhancement work needed on prange for
>>>> that to happen, as it doesn't currently track any points-to info. That
>>>> would then be followed by converting the warning code to use ranger/VRP
>>>> instead.
>>>>
>>>> Any adjustments to ranger for this are unlikely to affect anything
>>>> until that work is done, and I do not think anyone is equipped to
>>>> attempt to update the existing pointer-query code.
>>>> 
>>>> Unfortunately :-(
>>> 
>>> Examples I have in mind for the .ACCESS_WITH_SIZE are the
>>> following two:
>>> 
>>> struct foo {
>>> 
>>>   char arr[3];
>>>   int b;
>>> };
>>> 
>>> void f(struct foo x)
>>> {
>>>   char *ptr = x.arr;
>>>   ptr[4] = 10;
>>> }
>> 
>> The above is an example about decaying a field array of a structure to a
>> pointer.
>>
>> Yes, usually tracking and keeping the bound info for a field is harder than
>> for a regular variable. However, I think it’s still possible to improve
>> compiler analysis to do this, since the original bound info is in the code.
>> 
>>> 
>>> void g(char (*arr)[4])
>>> {
>>>   char *ptr = *arr;
>>>   ptr[4] = 1;
>>> }
>>> 
>> 
>> The above example is about decaying a formal parameter array to a pointer.
>> I think that since the bound information is in the code too, the current
>> compiler analysis should be able to be improved to catch such a case.
>>
>> For the above two cases, the current compiler analysis is not able to
>> propagate the bound information. But since the bound info is already in the
>> code, it’s possible to improve the current compiler analysis to propagate
>> such information more aggressively.
>>
>> Inserting a call to .ACCESS_WITH_SIZE for such cases is not necessary.
>>
>> .ACCESS_WITH_SIZE is necessary for the cases where the size information is
>> not available in the source code, i.e., the cases where we need to add an
>> attribute to specify the size of the access, for example, the counted_by
>> attribute, the access attribute, etc.
> 
> When you add an attribute, the information is also in the source.
> 
> .ACCESS_WITH_SIZE can be used to make semantic information
> from a type or attribute available to other passes, so it
> seems the right tool to be used here.
>


If I remember correctly, the main purpose of adding .ACCESS_WITH_SIZE is to
explicitly add the reference to the size field into the data flow, in order to
prevent any incorrect code reordering from happening.

Even without .ACCESS_WITH_SIZE, compiler phases that need the size information 
can get it from the IR. 
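
As a rough illustration of "the bound is already in the type", here is a
minimal sketch reusing the two types from the examples quoted above. The
helper names field_bound and param_bound are made up for this mail; they
only show that the bound can be recovered from the type itself, with no
attribute and no extra call.

```c
#include <assert.h>
#include <stddef.h>

struct foo {
  char arr[3];
  int b;
};

/* The bound of the field is part of its type (char[3]), so it can be
   checked at compile time without any annotation.  */
static_assert(sizeof(((struct foo *)0)->arr) == 3,
              "bound of foo.arr comes from its type");

/* Hypothetical helper: bound of the array field, taken from the type.  */
size_t field_bound(const struct foo *x)
{
  return sizeof x->arr;
}

/* Hypothetical helper: bound of the pointed-to array, again from the type
   char[4]; the pointer is never dereferenced at run time.  */
size_t param_bound(char (*arr)[4])
{
  return sizeof *arr;
}
```

Once the array has decayed to a plain char *, sizeof no longer sees the
bound, which is the propagation problem discussed above.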

My understanding is that this issue of a missing implicit data flow dependency
exists only for the counted_by attribute, not for the other TYPEs, which already
carry the bound information.
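
To make the counted_by case concrete, here is a minimal sketch. The names
struct buf, buf_alloc, buf_store and the COUNTED_BY macro are made up for
illustration; the macro is an assumption of mine that expands to nothing on
compilers without the attribute, so the code still builds there. The bound
of data[] lives in a separate field, so every access to data[] implicitly
depends on an earlier store to count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper macro: use counted_by where the compiler supports it,
   otherwise expand to nothing so the sketch still compiles.  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

struct buf {
  size_t count;                   /* number of elements in data[] */
  char data[] COUNTED_BY(count);  /* bound is in 'count', not in the type */
};

/* Allocate a buffer with room for n chars; returns NULL on failure.  */
struct buf *buf_alloc(size_t n)
{
  struct buf *b = malloc(sizeof(struct buf) + n);
  if (b)
    b->count = n;   /* must be stored before any access to b->data */
  return b;
}

/* Store c at index i; returns 0 on success, -1 if i is out of bounds.  */
int buf_store(struct buf *b, size_t i, char c)
{
  assert(b != NULL);
  if (i >= b->count)
    return -1;
  b->data[i] = c;
  return 0;
}
```

Nothing in the type of data[] ties it to count, so the store to b->count in
buf_alloc and a later access to b->data are unrelated as far as the type is
concerned. As far as I recall, that is the dependency the call is meant to
make explicit.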

Qing
> 
> Martin
> 
>> 
>> thanks.
>> 
>> Qing
>> 
>>> 
>>> Martin
