Hi, 

Some update on the v3 of the patch after more study and discussion.

1.  There were more discussion among Kees, Bill, I and also Apple engineers on 
whether we should support 
      “counted_by” on pointers with type “void *”.  And finally we agreed to:

      A. Do not support the “counted_by” on pointers with type “void *”, issue 
errors when “counted_by” is attached
           to a pointer with “void *”;
      B. For such pointers with type “void *”, a future new attribute 
“sized_by” will be provided. 

      The major reason for this decision is: (from Yeoul Na, apple engineer):

      "By definition, with `__counted_by(N)`, the pointer points to memory that 
contains N elements of pointee type 
       (see 
https://clang.llvm.org/docs/BoundsSafety.html#external-bounds-annotations). And 
there’s no such thing 
        as elements of void; that’s why, we can’t do `void arr[10]`. So `void 
*__counted_by(N)` feels just wrong”

       And I agree with her on this argument. 
        I think it’s better to provide a “sized_by” attribute for “void *” 
pointers. 

        Let me know if you have a different opinion on this decision.

2. I also studied further on the implementation of the array bounds sanitizer 
for pointers with counted_by attribute.

    I tried the other approach I thought it might be better than my current 
implementation:

For the following:

struct annotated {
 int b;
 int *c __attribute__ ((counted_by (b)));
} *p_array_annotated;

p_array_annotated->c[annotated_index] = 2;

generate ARRAY_REF instead of INDIRECT_REF for the above 
p_array_annotated->c[annotated_index]
in C FE.  then the INDEX info was kept nicely in the IR when getting to the 
bound sanitizer  instrumentation
phase, and all the hacks that try to get the index from the OFFSET computation 
expression are avoided.

However, there are two major issues with this approach:
      A. Now the ARRAY_REF might include an  operand with POINTER TYPE (it was 
assumed to 
           be ARRAY_TYPE by default), all the corresponding routines that have 
such assumption need to be
           adjusted (for example, debug_generic_expr does not work anymore, I 
believe that more other routines
            Need to be adjusted).

      B. When converting the ARRAY_REF to INDIRECT_REF after the bound 
instrumentation for the pointer array,
           the current routines cannot handle the “COMPOUND_EXPR” that wrapping 
the  index and instrumented code
           Correctly. All the corresponding routines need to be adjusted to 
specially handle the following COMPOUND_EXPR:
   
            (.UBSAN_BOUNDS (0B, SAVE_EXPR <annotated_index>, (sizetype) 
MAX_EXPR <MEM <int> [(void *)&*p_array_annotated], 0>), SAVE_EXPR 
<annotated_index>)

So, I think that this approach is even worse than my current one. 

    I’d like to keep the current implementation for bound sanitizer. 

    Let me know if you have a different opinion on this.


I will modify my v3 patch to include the above 2, and send out the v4 after 
that.

thanks.

Qing
              
            


> On Apr 30, 2025, at 08:49, Qing Zhao <qing.z...@oracle.com> wrote:
> 
> Hi,
> 
> This is the 3rd version of the patch set to extend "counted_by" attribute
> to pointer fields of structures.
> 
> compared to the 2nd version:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681727.html
> https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681728.html
> https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681729.html
> https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681730.html
> 
> The major change is:
> 
> "The counted_by attribute is allowed for a void pointer field, the element
> size of such pointer array is assumed as size 1."
> 
> both __builtin_dynamic_object_size and bounds sanitizer handle this.
> 
> This patch set includes 3 parts:
> 
> 1.Extend "counted_by" attribute to pointer fields of structures. 
> 2.Convert a pointer reference with counted_by attribute to .ACCESS_WITH_SIZE
>    and use it in builtinin-object-size.
> 3.Use the counted_by attribute of pointers in array bound checker.
> 
> In which, the patch 1 and 2 are simple and straightforward, however, the 
> patch 3  
> is a little complicate due to the following reason:
> 
>    Current array bound checker only instruments ARRAY_REF, and the INDEX
>    information is the 2nd operand of the ARRAY_REF.
> 
>    When extending the array bound checker to pointer references with
>    counted_by attributes, the hardest part is to get the INDEX of the
>    corresponding array ref from the offset computation expression of
>    the pointer ref. 
> 
> So, the patch #3 is a RFC: I do need some comments and suggestions on it.
> And I do wonder for the access to pointer arrays:
> 
> struct annotated {
>  int b;
>  int *c __attribute__ ((counted_by (b)));
> } *p_array_annotated;
> 
> p_array_annotated->c[annotated_index] = 2;
> 
> Is it possible to generate ARRAY_REF instead of INDIRECT_REF for the above 
> p_array_annotated->c[annotated_index]
> in C FE? then we can keep the INDEX info in the IR and avoid all the hacks 
> to get the index from the OFFSET computation expression.
> 
> The whole patch set has been rebased on the latest trunk, bootstrapped 
> and regression tested on both aarch64 and x86.
> 
> Please let me know whether the patch 1 and patch 2 are ready for thunk?
> and any suggestions and comments on the patch 3?
> 
> Thanks a lot.
> 
> Qing
> 
> ========================================================
> 
> 
> the first version was submitted 3 months ago on 1/16/2025, and triggered
> a lot of discussion on whether we need a new syntax for counted_by
> attribute.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673837.html
> 
> After a long discussion since then: 
> (https://gcc.gnu.org/pipermail/gcc-patches/2025-March/677024.html)
> 
> We agreed to the following compromised solution:
> 
> 1. Keep the current syntax of counted_by for lone identifier;
> 2. Add a new attribute "counted_by_exp" for expressions.
> 
> Although there are still some discussion going on for the new 
> counted_by_exp attribute (In Clang community) 
> https://discourse.llvm.org/t/rfc-bounds-safety-in-c-syntax-compatibility-with-gcc/85885
> 
> The syntax for the lone identifier is kept the same as before.
> 
> So, I'd like to resubmit my previous patch of extending "counted_by"
> to pointer fields of structures. 
> 
> The whole patch set has been rebased on the latest trunk, some testing case
> adjustment,  bootstrapped  and regression tested on both aarch64 and x86.
> 
> There will be a seperate patch set for the new "counted_by_exp" 
> attribute later to cover the expressions cases.
> 
> The following are more details on this patch set:
> 
> For example:
> 
> struct PP {
>  size_t count2;
>  char other1;
>  char *array2 __attribute__ ((counted_by (count2)));
>  int other2;
> } *pp;
> 
> specifies that the "array2" is an array that is pointed by the
> pointer field, and its number of elements is given by the field
> "count2" in the same structure.
> 
> There are the following importand facts about "counted_by" on pointer
> fields compared to the "counted_by" on FAM fields:
> 
> 1. one more new requirement for pointer fields with "counted_by" attribute:
>   pp->array2 and pp->count2 can ONLY be changed by changing the whole 
> structure
>   at the same time.
> 
> 2. the following feature for FAM field with "counted_by" attribute is NOT
>   valid for the pointer field any more:
> 
>    " One important feature of the attribute is, a reference to the
>     flexible array member field uses the latest value assigned to the
>     field that represents the number of the elements before that
>     reference.  For example,
> 
>            p->count = val1;
>            p->array[20] = 0;  // ref1 to p->array
>            p->count = val2;
>            p->array[30] = 0;  // ref2 to p->array
> 
>     in the above, 'ref1' uses 'val1' as the number of the elements in
>     'p->array', and 'ref2' uses 'val2' as the number of elements in
>     'p->array'. "

Reply via email to