On Friday, 11.04.2025 at 18:14 +0000, Qing Zhao wrote:
> 
> > On Apr 11, 2025, at 13:37, Martin Uecker <uec...@tugraz.at> wrote:
> > 
> > On Friday, 11.04.2025 at 17:08 +0000, Qing Zhao wrote:
> > > 
> > > > On Apr 11, 2025, at 12:20, Martin Uecker <uec...@tugraz.at> wrote:
> > > > 
> > > > On Friday, 11.04.2025 at 16:01 +0000, Qing Zhao wrote:
> > > > > 
> > > > > > On Apr 11, 2025, at 10:53, Martin Uecker <uec...@tugraz.at> wrote:
> > > > > > 
> > > > > > On Friday, 11.04.2025 at 10:42 -0400, Andrew MacLeod wrote:
> > > > > > > On 4/11/25 10:27, Qing Zhao wrote:
> > > > > > > > 
> > > > > > > > > On Apr 10, 2025, at 11:12, Martin Uecker <uec...@tugraz.at> 
> > > > > > > > > wrote:
> > > > > > > > > 
> > > > > > > > > > On Thursday, 10.04.2025 at 10:55 -0400, Siddhesh Poyarekar wrote:
> > > > > > > > > > On 2025-04-10 10:50, Andrew MacLeod wrote:
> > > > > > > > > > > > It's not clear to me exactly what is being asked, but I 
> > > > > > > > > > > think the
> > > > > > > > > > > suggestion is that pointer references are being replaced 
> > > > > > > > > > > with a builtin
> > > > > > > > > > > function called .ACCESS_WITH_SIZE ?    and I presume that 
> > > > > > > > > > > builtin
> > > > > > > > > > > function has some parameters that give you relevant range 
> > > > > > > > > > > information of
> > > > > > > > > > > some sort?
> > > > > > > > > > Added, not replaced, but yes, that's essentially it.
> > > > > > > > > > 
> > > > > > > > > > > > range-ops is set up to pull range information from builtin 
> > > > > > > > > > > functions
> > > > > > > > > > > already in gimple-range-op.cc::
> > > > > > > > > > > gimple_range_op_handler::maybe_builtin_call ().  We'd 
> > > > > > > > > > > just need to write
> > > > > > > > > > > a handler for this new one.  You can pull information 
> > > > > > > > > > > from 2 operands
> > > > > > > > > > > under normal circumstances, but exceptions can be made.   
> > > > > > > > > > >  I'd need a
> > > > > > > > > > > description of what it looks like and how that translates 
> > > > > > > > > > > to range info.
> > > > > > > > > > That's perfect!  It's probably redundant for cases where we 
> > > > > > > > > > end up with
> > > > > > > > > > both .ACCESS_WITH_SIZE and a __bos/__bdos call, but I don't 
> > > > > > > > > > remember if
> > > > > > > > > > that's the only place where .ACCESS_WITH_SIZE is generated 
> > > > > > > > > > today.  Qing,
> > > > > > > > > > could you please work with Andrew on this?
> > > > > > > > > BTW, what I would find very interesting is inserting such 
> > > > > > > > > information
> > > > > > > > > at the points where arrays decay to pointer.
> > > > > > > > Is the following the example?
> > > > > > > > 
> > > > > > > > 1 #include <stdio.h>
> > > > > > > > 2
> > > > > > > > 3 void foo (int arr[]) {
> > > > > > > > 4   // Inside the function, arr is treated as a pointer
> > > > > > > > 5   arr[6] = 10;
> > > > > > > > 6 }
> > > > > > > > 7
> > > > > > > > 8 int my_array[5] = {10, 20, 30, 40, 50};
> > > > > > > > 9
> > > > > > > > 10 int main() {
> > > > > > > > 11   my_array[6] = 6;
> > > > > > > > 12   int *ptr = my_array; // Array decays to pointer here
> > > > > > > > 13   ptr[7] = 7;
> > > > > > > > 14   foo (my_array);
> > > > > > > > > 15
> > > > > > > > > 16   return 0;
> > > > > > > > 17 }
> > > > > > > > 
> > > > > > > > When I use the latest gcc to compile the above with 
> > > > > > > > -Warray-bounds:
> > > > > > > > 
> > > > > > > > []$ gcc -O2 -Warray-bounds t.c
> > > > > > > > t.c: In function ‘main’:
> > > > > > > > t.c:13:6: warning: array subscript 7 is outside array bounds of 
> > > > > > > > ‘int[5]’ [-Warray-bounds=]
> > > > > > > >  13 |   ptr[7] = 7;
> > > > > > > >     |   ~~~^~~
> > > > > > > > t.c:8:5: note: at offset 28 into object ‘my_array’ of size 20
> > > > > > > >   8 | int my_array[5] = {10, 20, 30, 40, 50};
> > > > > > > >     |     ^~~~~~~~
> > > > > > > > In function ‘foo’,
> > > > > > > >   inlined from ‘main’ at t.c:14:3:
> > > > > > > > t.c:5:10: warning: array subscript 6 is outside array bounds of 
> > > > > > > > ‘int[5]’ [-Warray-bounds=]
> > > > > > > >   5 |   arr[6] = 10;
> > > > > > > >     |   ~~~~~~~^~~~
> > > > > > > > t.c: In function ‘main’:
> > > > > > > > t.c:8:5: note: at offset 24 into object ‘my_array’ of size 20
> > > > > > > >   8 | int my_array[5] = {10, 20, 30, 40, 50};
> > > > > > > >     |     ^~~~~~~~
> > > > > > > > 
> > > > > > > > > It looks like, even after the array decays to a pointer, the
> > > > > > > > > bound information is still carried for the decayed pointer
> > > > > > > > > somehow (I guess VRP did this?)
> > > > > > > 
> > > > > > > No, the behaviour in these warnings is from something else. 
> > > > > > > Although 
> > > > > > > some range info from VRP is used, most of this is tracked by the 
> > > > > > > pointer_query (pointer-query.cc) mechanism that was written a 
> > > > > > > number of 
> > > > > > > years ago before ranger was completed.  It attempts to do its own 
> > > > > > > custom 
> > > > > > > tracking of pointers and what they point to and the size of 
> > > > > > > things they 
> > > > > > > access.
> > > > > > > 
> > > > > > > > There are issues with that code, and the goal is to replace it
> > > > > > > > with ranger's prange.  Alas, prange needs enhancement work for
> > > > > > > > that to happen, as it doesn't currently track any points-to
> > > > > > > > info.  That would then be followed by converting the warning
> > > > > > > > code to use ranger/VRP instead.
> > > > > > > 
> > > > > > > > Any adjustments to ranger for this are unlikely to affect
> > > > > > > > anything until that work is done, and I do not think anyone is
> > > > > > > > equipped to attempt to update the existing pointer-query code.
> > > > > > > 
> > > > > > > Unfortunately :-(
> > > > > > 
> > > > > > Examples I have in mind for the .ACCESS_WITH_SIZE are the
> > > > > > following two:
> > > > > > 
> > > > > > struct foo {
> > > > > > 
> > > > > >  char arr[3];
> > > > > >  int b;
> > > > > > };
> > > > > > 
> > > > > > void f(struct foo x)
> > > > > > {
> > > > > >  char *ptr = x.arr;
> > > > > >  ptr[4] = 10;
> > > > > > }
> > > > > 
> > > > > The above is an example about decaying a field array of a structure 
> > > > > to a pointer. 
> > > > > 
> > > > > > Yes, tracking and keeping the bound info for a field is usually
> > > > > > harder than for a regular variable.  However, I think it's still
> > > > > > possible to improve the compiler analysis to do this, since the
> > > > > > original bound info is in the code.
> > > > > 
> > > > > > 
> > > > > > void g(char (*arr)[4])
> > > > > > {
> > > > > >  char *ptr = *arr;
> > > > > >  ptr[4] = 1;
> > > > > > }
> > > > > > 
> > > > > 
> > > > > > The above example is about decaying a formal parameter array to a
> > > > > > pointer.
> > > > > > Since the bound information is in the code here too, I think it
> > > > > > should be possible to improve the current compiler analysis to
> > > > > > catch such cases.
> > > > > 
> > > > > > For the above two cases, the current compiler analysis is not able
> > > > > > to propagate the bound information.  But since the bound info is
> > > > > > already in the code, it's possible to improve the current compiler
> > > > > > analysis to propagate such information more aggressively.
> > > > > >
> > > > > > Inserting a call to .ACCESS_WITH_SIZE for such cases is not necessary.
> > > > > >
> > > > > > .ACCESS_WITH_SIZE is necessary for the cases where the size
> > > > > > information is not available in the source code, i.e., the cases
> > > > > > where we need to add an attribute to specify the size of the
> > > > > > access, for example the counted_by attribute, the access attribute,
> > > > > > etc.
> > > > 
> > > > When you add an attribute, the information is also in the source.
> > > > 
> > > > .ACCESS_WITH_SIZE can be used to make  semantic information
> > > > from a type or attribute available to other passes, so it
> > > > seems the right tool to be used here. 
> > > > 
> > > 
> > > > If I remember correctly, the main purpose of adding .ACCESS_WITH_SIZE
> > > > is to explicitly add the reference to the size field into the data
> > > > flow in order to prevent any incorrect code reordering from happening.
> > > 
> > > Even without .ACCESS_WITH_SIZE, compiler phases that need the size 
> > > information can get it from the IR. 
> > > 
> > > > My understanding is that this issue of the missing implicit data-flow
> > > > dependency exists only for the counted_by attribute, not for other
> > > > types which already carry the bound information.
> > > 
> > 
> > The dependency issue is only for the size, but for
> > other types the size information is often not
> > preserved, so then not available later. 
> > .ACCESS_WITH_SIZE would solve this.
> 
> Yes, I see your points here:
> 
> The original size information from the array will be passed through
>  the 2nd parameter of .ACCESS_WITH_SIZE to the decayed pointer. -:)
> 
> However, I am not sure how much more benefit this brings compared
> to improving the compiler analysis to better track the bound information
> for pointers through data flow.

My understanding is that it would be very difficult and require a lot
of changes if one wanted to preserve the size expressions in
array types so that later passes can still access them.  The
easiest solution would be to add an internal function at the
points where arrays decay to pointers, which explicitly stores the size
for later use.  Since .ACCESS_WITH_SIZE does exactly this for other
purposes, it would seem natural to reuse it.  But it is quite
possible that I am missing something important.
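
To illustrate the idea only: below is a rough user-level sketch in plain C
of "store the size at the point where the array decays".  This is not how
GCC implements .ACCESS_WITH_SIZE, which is an internal function that cannot
be called from source code.  The names sized_ptr, DECAY_WITH_SIZE,
checked_store, f_checked and g_checked are all hypothetical and exist only
for this sketch.

```c
#include <stddef.h>

/* Hypothetical user-level model, NOT GCC's internal .ACCESS_WITH_SIZE:
   a decayed pointer paired with the byte size of the array it came from.  */
struct sized_ptr {
  char *ptr;
  size_t size;
};

/* Capture the size at the decay point, while the array type is still
   visible.  ARR must be an array of char, not a pointer.  */
#define DECAY_WITH_SIZE(arr) ((struct sized_ptr) { (arr), sizeof (arr) })

/* Store VAL at index IDX only if it lies inside the recorded bounds.
   Returns 1 if the store happened, 0 if it was out of bounds.  */
static int
checked_store (struct sized_ptr p, size_t idx, char val)
{
  if (idx >= p.size)
    return 0;
  p.ptr[idx] = val;
  return 1;
}

struct foo {
  char arr[3];
  int b;
};

/* Mirrors f() from the first example: index 4 is rejected because the
   recorded size of x->arr is 3.  */
static int
f_checked (struct foo *x)
{
  struct sized_ptr p = DECAY_WITH_SIZE (x->arr);
  return checked_store (p, 4, 10);
}

/* Mirrors g() from the second example, with char (*arr)[4]: index 4 is
   rejected because the recorded size of *arr is 4.  */
static int
g_checked (char (*arr)[4])
{
  struct sized_ptr p = DECAY_WITH_SIZE (*arr);
  return checked_store (p, 4, 1);
}
```

In both cases the size travels with the pointer from the decay point on,
so a later check does not have to recover it from the array type.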

Martin
