> On Jan 29, 2024, at 10:50 AM, Martin Uecker <[email protected]> wrote:
>
> Am Montag, dem 29.01.2024 um 15:09 +0000 schrieb Qing Zhao:
>> Thank you!
>>
>> Joseph and Richard, could you also comment on this?
>>
>>> On Jan 28, 2024, at 5:09 AM, Martin Uecker <[email protected]> wrote:
>>>
>>> Am Freitag, dem 26.01.2024 um 14:33 +0000 schrieb Qing Zhao:
>>>>
>>>>> On Jan 26, 2024, at 3:04 AM, Martin Uecker <[email protected]> wrote:
>>>>>
>>>>>
>>>>> I haven't looked at the patch, but it sounds you give the result
>>>>> the wrong type. Then patching up all use cases instead of the
>>>>> type seems wrong.
>>>>
>>>> Yes, this is for resolving a very early gimplification issue as I reported
>>>> last Nov:
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
>>>>
>>>> Since no-one responded at that time, I fixed the issue by replacing the
>>>> ARRAY_REF
>>>> With a pointer indirection:
>>>> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
>>>>
>>>> The reason for such change is: return a flexible array member TYPE is not
>>>> allowed
>>>> by C language (our gimplification follows this rule), so, we have to
>>>> return a pointer TYPE instead.
>>>>
>>>> ******The new internal function
>>>>
>>>> .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE,
>>>> ACCESS_MODE, INDEX)
>>>>
>>>> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
>>>>
>>>> which returns the "REF_TO_OBJ" same as the 1st argument;
>>>>
>>>> Both the return type and the type of the first argument of this function
>>>> have been converted from
>>>> the incomplete array type to the corresponding pointer type.
>>>>
>>>> As a result, the original ARRAY_REF was converted to an INDIRECT_REF, the
>>>> original INDEX of the ARRAY_REF was lost
>>>> when converting from ARRAY_REF to INDIRECT_REF, in order to keep the INDEX
>>>> for bound sanitizer instrumentation, I added
>>>> The 6th argument “INDEX”.
>>>>
>>>> What’s your comment and suggestion on this solution?
>>>
>>> I am not entirely sure but changing types in the FE seems
>>> problematic because this breaks language semantics. And
>>> then adding special code everywhere to treat it specially
>>> in the FE does not seem a good way forward.
>>>
>>> If I understand correctly, returning an incomplete array
>>> type is not allowed and then fails during gimplification.
>>
>> Yes, this is the problem in gimplification.
>>
>>> So I would suggest to make it return a pointer to the
>>> incomplete array (and not the element type)
>>
>>
>> for the following:
>>
>> struct annotated {
>> unsigned int size;
>> int array[] __attribute__((counted_by (size)));
>> };
>>
>> struct annotated * p = ….
>> p->array[9] = 0;
>>
>> The IL for the above array reference p->array[9] is:
>>
>> 1. If the return type is the original incomplete array type,
>>
>> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;
>>
>> (this triggered the gimplification failure since the return type cannot be a
>> complete type).
>>
>> 2. When the return type is changed to a pointer to the element type of the
>> incomplete array, (the current patch)
>> Then the original array reference naturally becomes an indirect reference
>> through the pointer
>>
>> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, 9) + 36) = 0;
>>
>> Since the original array reference becomes an indirect reference through the
>> pointer to the element array, the INDEX info
>> is mixed into the OFFSET of the indirect reference and lost, so, I added the
>> 6th argument to the routine .ACCESS_WITH_SIZE
>> to record the INDEX.
>>
>> 3. With your suggestion, the return type is changed to a pointer to the
>> incomplete array,
>> I just tried this to change the result type :
>>
>>
>> --- a/gcc/c/c-typeck.cc
>> +++ b/gcc/c/c-typeck.cc
>> @@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc,
>> tree ref,
>> tree counted_by_type)
>> {
>> gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
>> - tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref)));
>> + tree result_type = build_pointer_type (TREE_TYPE (ref));
>>
>> Then, I got the following FE errors:
>>
>> test.c:10:11: error: invalid use of flexible array member
>> 10 | p->array[9] = 0;
>>
>> The reason for the error is: when the original array_ref becomes an
>> indirect_ref through the pointer to the incomplete array,
>> During the computation of the OFFSET to the pointer, the TYPE_SIZE_UNIT
>> (type) is invalid since the type is an incomplete array.
>> As a result, the OFFSET cannot computed for the indirect_ref.
>>
>> Looks like even more issues with this approach.
>
> Yes, but only because the following is missing:
>
>>
>>
>>> but then wrap
>>> it with an indirection when inserting this code in the FE
>>> so that the full replacement has the correct type again
>>> (of the incomplete array).
>>
>> I don’t quite understand the above, could you please explain this in more
>> details? (If possible, could you please use the above small example?)
>> thanks.
>
> You would need to add an indirection:
>
> (*(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)))[9] = 0;
>
> if .ACCESS_WITH_SIZE has type T (*)[], i.e. pointer to incomplete
> array of type T, then (*(.ACCESS_WITH_SIZE (...))) has type T[], i.e.
> incomplete array of type.
>
> And you shouldn't even consider array derefencing part at all,
> but replace the p->array when the component ref is constructed.
Thanks, I see now.
I just updated the routine “build_access_with_size_for_counted_by” as
following:
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -2619,7 +2619,7 @@ build_access_with_size_for_counted_by (location_t loc,
tree ref,
tree counted_by_type)
{
gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref)));
- tree result_type = build_pointer_type (TREE_TYPE (TREE_TYPE (ref)));
+ tree result_type = build_pointer_type (TREE_TYPE (ref));
unsigned int counted_by_precision = TYPE_PRECISION (counted_by_type);
tree call
@@ -2632,6 +2632,7 @@ build_access_with_size_for_counted_by (location_t loc,
tree ref,
counted_by_precision),
build_int_cst (integer_type_node, -1),
build_int_cst (integer_type_node, -1));
+ call = build1 (INDIRECT_REF, TREE_TYPE (ref), call);
SET_EXPR_LOCATION (call, loc);
return call;
}
This works for the small testing case, the generated IR is:
(*.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1, -1))[9] = 0;
I will do more testings and adjust my patch on this change.
Thanks a lot for your help.
Qing
>
> Martin
>
>
>
>>
>>>
>>>
>>> Alternatively, one could allow this during gimplification
>>> or add some conversion.
>>
>> Allowing this in gimplification might trigger some other issues. I guess
>> that adding conversion
>> in the end of the FE or in the beginning of gimplification might be better.
>>
>> i.e, in FE, still keep the original incomplete array type as the return
>> type for the routine .ACCESS_WITH_SIZE, i.e
>>
>> .ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1)[9] = 0;
>>
>> But add a conversion from the above array_ref to an indirect_ref in the end
>> of FE or in the beginning of gimplification:
>>
>> *(.ACCESS_WITH_SIZE ((int *) &p->array, &p->size, 1, 32, -1) + 36) = 0;
>>
>> With this approach, during FE, the original ARRAY TYPE and the INDEX can be
>> kept, the array bound sanitizer instrumentation
>> Will be much easier than my current approach.
>>
>> Is this approach reasonable?
>>
>> If so, where is better to add this conversion, in the end of FE or in the
>> beginning of gimplification?
>>
>> Thanks.
>>
>> Qing
>>
>>
>>>
>>> Martin
>>>
>>>
>>>>
>>>> Thanks.
>>>>
>>>> Qing
>>>>
>>>>
>>>>>
>>>>> Martin
>>>>>
>>>>>
>>>>> Am Donnerstag, dem 25.01.2024 um 20:11 +0000 schrieb Qing Zhao:
>>>>>> Thanks a lot for the testing.
>>>>>>
>>>>>> Yes, I can repeat the issue with the following small example:
>>>>>>
>>>>>> #include <stdlib.h>
>>>>>> #include <stddef.h>
>>>>>> #include <stdio.h>
>>>>>>
>>>>>> #define MAX(a, b) ((a) > (b) ? (a) : (b))
>>>>>>
>>>>>> struct untracked {
>>>>>> int size;
>>>>>> int array[] __attribute__((counted_by (size)));
>>>>>> } *a;
>>>>>> struct untracked * alloc_buf (int index)
>>>>>> {
>>>>>> struct untracked *p;
>>>>>> p = (struct untracked *) malloc (MAX (sizeof (struct untracked),
>>>>>> (offsetof (struct untracked,
>>>>>> array[0])
>>>>>> + (index) * sizeof (int))));
>>>>>> p->size = index;
>>>>>> return p;
>>>>>> }
>>>>>>
>>>>>> int main()
>>>>>> {
>>>>>> a = alloc_buf(10);
>>>>>> printf ("same_type is %d\n",
>>>>>> (__builtin_types_compatible_p(typeof (a->array), typeof
>>>>>> (&(a->array)[0]))));
>>>>>> return 0;
>>>>>> }
>>>>>>
>>>>>>
>>>>>> /home/opc/Install/latest-d/bin/gcc -O2 btcp.c
>>>>>> same_type is 1
>>>>>>
>>>>>> Looks like that the “typeof” operator need to be handled specially in C
>>>>>> FE
>>>>>> for the new internal function .ACCESS_WITH_SIZE.
>>>>>>
>>>>>> (I have specially handle the operator “offsetof” in C FE already).
>>>>>>
>>>>>> Will fix this issue.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> Qing
>>>>>>
>>>>>>> On Jan 24, 2024, at 7:51 PM, Kees Cook <[email protected]> wrote:
>>>>>>>
>>>>>>> On Wed, Jan 24, 2024 at 12:29:51AM +0000, Qing Zhao wrote:
>>>>>>>> This is the 4th version of the patch.
>>>>>>>
>>>>>>> Thanks very much for this!
>>>>>>>
>>>>>>> I tripped over an unexpected behavioral change that the Linux kernel
>>>>>>> depends on:
>>>>>>>
>>>>>>> __builtin_types_compatible_p() no longer treats an array marked with
>>>>>>> counted_by as different from that array's decayed pointer. Specifically,
>>>>>>> the kernel uses these macros:
>>>>>>>
>>>>>>>
>>>>>>> /*
>>>>>>> * Force a compilation error if condition is true, but also produce a
>>>>>>> * result (of value 0 and type int), so the expression can be used
>>>>>>> * e.g. in a structure initializer (or where-ever else comma expressions
>>>>>>> * aren't permitted).
>>>>>>> */
>>>>>>> #define BUILD_BUG_ON_ZERO(e) ((int)(sizeof(struct { int:(-!!(e)); })))
>>>>>>>
>>>>>>> #define __same_type(a, b) __builtin_types_compatible_p(typeof(a),
>>>>>>> typeof(b))
>>>>>>>
>>>>>>> /* &a[0] degrades to a pointer: a different type from an array */
>>>>>>> #define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a),
>>>>>>> &(a)[0]))
>>>>>>>
>>>>>>>
>>>>>>> This gets used in various places to make sure we're dealing with an
>>>>>>> array for a macro:
>>>>>>>
>>>>>>> #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) +
>>>>>>> __must_be_array(arr))
>>>>>>>
>>>>>>>
>>>>>>> So this builds:
>>>>>>>
>>>>>>> struct untracked {
>>>>>>> int size;
>>>>>>> int array[];
>>>>>>> } *a;
>>>>>>>
>>>>>>> __must_be_array(a->array)
>>>>>>> => 0 (as expected)
>>>>>>> __builtin_types_compatible_p(typeof(a->array), typeof(&(a->array)[0]))
>>>>>>> => 0 (as expected, array vs decayed array pointer)
>>>>>>>
>>>>>>>
>>>>>>> But if counted_by is added, we get a build failure:
>>>>>>>
>>>>>>> struct tracked {
>>>>>>> int size;
>>>>>>> int array[] __counted_by(size);
>>>>>>> } *b;
>>>>>>>
>>>>>>> __must_be_array(b->array)
>>>>>>> => build failure (not expected)
>>>>>>> __builtin_types_compatible_p(typeof(b->array), typeof(&(b->array)[0]))
>>>>>>> => 1 (not expected, both pointers?)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Kees Cook
>>
>>
>
> --
> Univ.-Prof. Dr. rer. nat. Martin Uecker
> Graz University of Technology
> Institute of Biomedical Imaging