> On Apr 11, 2025, at 14:24, Martin Uecker <uec...@tugraz.at> wrote:
> 
> On Friday, 11.04.2025 at 18:14 +0000, Qing Zhao wrote:
>> 
>>> On Apr 11, 2025, at 13:37, Martin Uecker <uec...@tugraz.at> wrote:
>>> 
>>> On Friday, 11.04.2025 at 17:08 +0000, Qing Zhao wrote:
>>>> 
>>>>> On Apr 11, 2025, at 12:20, Martin Uecker <uec...@tugraz.at> wrote:
>>>>> 
>>>>> On Friday, 11.04.2025 at 16:01 +0000, Qing Zhao wrote:
>>>>>> 
>>>>>>> On Apr 11, 2025, at 10:53, Martin Uecker <uec...@tugraz.at> wrote:
>>>>>>> 
>>>>>>> On Friday, 11.04.2025 at 10:42 -0400, Andrew MacLeod wrote:
>>>>>>>> On 4/11/25 10:27, Qing Zhao wrote:
>>>>>>>>> 
>>>>>>>>>> On Apr 10, 2025, at 11:12, Martin Uecker <uec...@tugraz.at> wrote:
>>>>>>>>>> 
>>>>>>>>>> On Thursday, 10.04.2025 at 10:55 -0400, Siddhesh Poyarekar wrote:
>>>>>>>>>>> On 2025-04-10 10:50, Andrew MacLeod wrote:
>>>>>>>>>>>> It's not clear to me exactly what is being asked, but I think the
>>>>>>>>>>>> suggestion is that pointer references are being replaced with a
>>>>>>>>>>>> builtin function called .ACCESS_WITH_SIZE?  And I presume that
>>>>>>>>>>>> builtin function has some parameters that give you relevant range
>>>>>>>>>>>> information of some sort?
>>>>>>>>>>> Added, not replaced, but yes, that's essentially it.
>>>>>>>>>>> 
>>>>>>>>>>>> range-ops is set up to pull range information from builtin
>>>>>>>>>>>> functions already, in gimple-range-op.cc::
>>>>>>>>>>>> gimple_range_op_handler::maybe_builtin_call ().  We'd just need to
>>>>>>>>>>>> write a handler for this new one.  You can pull information from 2
>>>>>>>>>>>> operands under normal circumstances, but exceptions can be made.
>>>>>>>>>>>> I'd need a description of what it looks like and how that
>>>>>>>>>>>> translates to range info.
>>>>>>>>>>> That's perfect!  It's probably redundant for cases where we end up
>>>>>>>>>>> with both .ACCESS_WITH_SIZE and a __bos/__bdos call, but I don't
>>>>>>>>>>> remember if that's the only place where .ACCESS_WITH_SIZE is
>>>>>>>>>>> generated today.  Qing, could you please work with Andrew on this?
>>>>>>>>>> BTW, what I would find very interesting is inserting such information
>>>>>>>>>> at the points where arrays decay to pointers.
>>>>>>>>> Is the following an example of that?
>>>>>>>>> 
>>>>>>>>> 1 #include <stdio.h>
>>>>>>>>> 2
>>>>>>>>> 3 void foo (int arr[]) {
>>>>>>>>> 4   // Inside the function, arr is treated as a pointer
>>>>>>>>> 5   arr[6] = 10;
>>>>>>>>> 6 }
>>>>>>>>> 7
>>>>>>>>> 8 int my_array[5] = {10, 20, 30, 40, 50};
>>>>>>>>> 9
>>>>>>>>> 10 int main() {
>>>>>>>>> 11   my_array[6] = 6;
>>>>>>>>> 12   int *ptr = my_array; // Array decays to pointer here
>>>>>>>>> 13   ptr[7] = 7;
>>>>>>>>> 14   foo (my_array);
>>>>>>>>> 15
>>>>>>>>> 16   return 0;
>>>>>>>>> 17 }
>>>>>>>>> 
>>>>>>>>> When I use the latest gcc to compile the above with -Warray-bounds:
>>>>>>>>> 
>>>>>>>>> []$ gcc -O2 -Warray-bounds t.c
>>>>>>>>> t.c: In function ‘main’:
>>>>>>>>> t.c:13:6: warning: array subscript 7 is outside array bounds of 
>>>>>>>>> ‘int[5]’ [-Warray-bounds=]
>>>>>>>>> 13 |   ptr[7] = 7;
>>>>>>>>>    |   ~~~^~~
>>>>>>>>> t.c:8:5: note: at offset 28 into object ‘my_array’ of size 20
>>>>>>>>>  8 | int my_array[5] = {10, 20, 30, 40, 50};
>>>>>>>>>    |     ^~~~~~~~
>>>>>>>>> In function ‘foo’,
>>>>>>>>>  inlined from ‘main’ at t.c:14:3:
>>>>>>>>> t.c:5:10: warning: array subscript 6 is outside array bounds of 
>>>>>>>>> ‘int[5]’ [-Warray-bounds=]
>>>>>>>>>  5 |   arr[6] = 10;
>>>>>>>>>    |   ~~~~~~~^~~~
>>>>>>>>> t.c: In function ‘main’:
>>>>>>>>> t.c:8:5: note: at offset 24 into object ‘my_array’ of size 20
>>>>>>>>>  8 | int my_array[5] = {10, 20, 30, 40, 50};
>>>>>>>>>    |     ^~~~~~~~
>>>>>>>>> 
>>>>>>>>> It looks like the bound information is still carried for the
>>>>>>>>> decayed pointer somehow, even after the array decays to a pointer
>>>>>>>>> (I guess VRP did this?).
>>>>>>>> 
>>>>>>>> No, the behaviour in these warnings is from something else. Although
>>>>>>>> some range info from VRP is used, most of this is tracked by the
>>>>>>>> pointer_query (pointer-query.cc) mechanism that was written a number
>>>>>>>> of years ago, before ranger was completed.  It attempts to do its own
>>>>>>>> custom tracking of pointers, what they point to, and the size of the
>>>>>>>> things they access.
>>>>>>>> 
>>>>>>>> There are issues with that code, and the goal is to replace it with
>>>>>>>> ranger's prange.  Alas, prange needs enhancement work before that can
>>>>>>>> happen, as it doesn't currently track any points-to info.  That would
>>>>>>>> then be followed by converting the warning code to use ranger/VRP
>>>>>>>> instead.
>>>>>>>> 
>>>>>>>> Any adjustments to ranger for this are unlikely to affect anything 
>>>>>>>> until that work is done, and I do not think anyone is equipped to 
>>>>>>>> attempt to update the existing pointer-query code.
>>>>>>>> 
>>>>>>>> Unfortunately :-(
>>>>>>> 
>>>>>>> Examples I have in mind for the .ACCESS_WITH_SIZE are the
>>>>>>> following two:
>>>>>>> 
>>>>>>> struct foo {
>>>>>>>   char arr[3];
>>>>>>>   int b;
>>>>>>> };
>>>>>>> 
>>>>>>> void f(struct foo x)
>>>>>>> {
>>>>>>>   char *ptr = x.arr;
>>>>>>>   ptr[4] = 10;  /* index 4 is past the end of arr[3] */
>>>>>>> }
>>>>>> 
>>>>>> The above is an example of a structure's array field decaying to a
>>>>>> pointer.
>>>>>> 
>>>>>> Yes, tracking and keeping the bound info for a field is usually harder
>>>>>> than for a regular variable.  However, I think it's still possible to
>>>>>> improve the compiler analysis to do this, since the original bound
>>>>>> info is in the code.
>>>>>> 
>>>>>>> 
>>>>>>> void g(char (*arr)[4])
>>>>>>> {
>>>>>>>   char *ptr = *arr;
>>>>>>>   ptr[4] = 1;  /* index 4 is past the end of char[4] */
>>>>>>> }
>>>>>>> 
>>>>>> 
>>>>>> The above example is about a formal parameter array decaying to a
>>>>>> pointer.
>>>>>> I think that since the bound information is in the code here too, the
>>>>>> current compiler analysis could be improved to catch such cases.
>>>>>> 
>>>>>> For the above two cases, the current compiler analysis is not able to
>>>>>> propagate the bound information.  But since the bound info is already
>>>>>> in the code, it's possible to improve the current compiler analysis to
>>>>>> propagate such information more aggressively.
>>>>>> 
>>>>>> Inserting a call to .ACCESS_WITH_SIZE for such cases is not necessary.
>>>>>> 
>>>>>> .ACCESS_WITH_SIZE is necessary for the cases where the size
>>>>>> information is not available in the source code, i.e., the cases where
>>>>>> we need to add an attribute to specify the size of the access, for
>>>>>> example the counted_by attribute, the access attribute, etc.
>>>>> 
>>>>> When you add an attribute, the information is also in the source.
>>>>> 
>>>>> .ACCESS_WITH_SIZE can be used to make semantic information
>>>>> from a type or attribute available to other passes, so it
>>>>> seems like the right tool to use here.
>>>>> 
>>>> 
>>>> If I remember correctly, the main purpose of adding .ACCESS_WITH_SIZE is
>>>> to explicitly add the reference to the size field into the data flow, in
>>>> order to prevent any incorrect code reordering from happening.
>>>> 
>>>> Even without .ACCESS_WITH_SIZE, compiler phases that need the size 
>>>> information can get it from the IR. 
>>>> 
>>>> My understanding is that this issue of missing implicit data-flow
>>>> dependency information exists only for the counted_by attribute, not for
>>>> the other types, which already have the bound information there.
>>>> 
>>> 
>>> The dependency issue is only for the size, but for
>>> other types the size information is often not
>>> preserved, and so is not available later.
>>> .ACCESS_WITH_SIZE would solve this.
>> 
>> Yes, I see your point here:
>> 
>> The original size information from the array will be passed through
>> the 2nd parameter of .ACCESS_WITH_SIZE to the decayed pointer. -:)
>> 
>> However, I am not sure how much more benefit this can bring compared
>> to improving the compiler analysis to better track the bound information
>> for pointers through data flow.
> 
> My understanding is that it would be very difficult and require a lot
> of changes if one wanted to preserve the size expressions in
> array types so that later passes can still access them.  The
> easiest solution would be to add an internal function at the
> points where arrays decay to pointers, which explicitly stores the size
> for later use.  Since .ACCESS_WITH_SIZE does exactly this for other
> purposes, it would seem natural to reuse it.  But it is quite
> possible that I am missing something important.

I will study this in more detail to see whether there are any potential issues.
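
As a rough illustration of the idea under discussion, here is a source-level
sketch of "record the size at the point where the array decays".  It is only
an analogy written in plain C, not how .ACCESS_WITH_SIZE is implemented or
represented in GCC; struct sized_ptr, DECAY_WITH_SIZE and checked_store are
hypothetical names made up for this sketch, and f and g mirror the two
examples quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a decayed pointer that still remembers the
   size of the array it came from.  Not a GCC type. */
struct sized_ptr {
    char  *ptr;   /* the decayed pointer */
    size_t size;  /* size in bytes of the original array */
};

/* Record the bound at the decay point.  ARR must be a real array, not a
   pointer, so that sizeof yields the size of the whole array. */
#define DECAY_WITH_SIZE(arr) ((struct sized_ptr){ (arr), sizeof (arr) })

/* Store VALUE at INDEX only if INDEX lies within the recorded bound.
   Returns true if the store happened, false if it was out of bounds. */
static bool checked_store(struct sized_ptr p, size_t index, char value)
{
    assert(p.ptr != NULL);
    if (index >= p.size)
        return false;   /* the access a later pass could diagnose */
    p.ptr[index] = value;
    return true;
}

struct foo {
    char arr[3];
    int b;
};

/* Mirrors f() above: index 4 is outside arr[3], so the store is refused. */
static bool f(struct foo x)
{
    struct sized_ptr ptr = DECAY_WITH_SIZE(x.arr);
    return checked_store(ptr, 4, 10);
}

/* Mirrors g() above: index 4 is outside char[4], so the store is refused. */
static bool g(char (*arr)[4])
{
    struct sized_ptr ptr = DECAY_WITH_SIZE(*arr);
    return checked_store(ptr, 4, 1);
}
```

In this sketch the bound travels with the pointer as an explicit value, so
the check does not depend on any later analysis recovering the array type.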

Thanks a lot for your comments and suggestions.

Qing
> 
> Martin
> 
