I'm working on ensuring that the GIMPLE semantics used by smtgcc are
correct, and I have a lot of questions about the details. I'll be sending
a series of emails with these questions. This first one is about pointers
in general.
Each section starts with a description of the semantics I've implemented
(or plan to implement), followed by concrete questions if relevant. Let me
know if the described semantics are incorrect or incomplete.
Pointers values
---------------
C imposes restrictions on pointer values (alignment, must point to valid
memory, etc.), but GIMPLE seems more permissive. For example, when
vectorizing a loop:
void foo(int *p, int n)
{
for (int i = 0; i < n; i++)
p[i] = 1;
}
the vectorized loop updates the pointer vectp_p.8 for the next iteration
at the end of the loop, causing it to point up to a vector-length out of
range after the last iteration:
<bb 3> :
# vectp_p.8_34 = PHI <vectp_p.8_35(6), p_8(D)(8)>
# ivtmp_37 = PHI <ivtmp_38(6), 0(8)>
MEM <vector(4) int> [(int *)vectp_p.8_34] = { 1, 1, 1, 1 };
vectp_p.8_35 = vectp_p.8_34 + 16;
ivtmp_38 = ivtmp_37 + 1;
if (ivtmp_38 < bnd.5_31)
goto <bb 6>;
else
goto <bb 10>;
<bb 6> :
goto <bb 3>;
Because of this, I allow all values -- essentially treating pointers as
unsigned integers. Things like ensuring pointers point to valid memory or
have correct alignment are only checked when dereferencing the pointer
(I'll discuss pointer dereferencing semantics in the next email).
Pointer arithmetic -- POINTER_PLUS_EXPR
---------------------------------------
POINTER_PLUS_EXPR calculates p + x:
* It is UB if the calculation overflows when
TYPE_OVERFLOW_WRAPS(ptr_type) is false.
* It is UB if p + x == 0, unless p == 0 and x == 0, unless
flag_delete_null_pointer_checks is false.
Since the pointer type is unsigned, it's not entirely clear what overflow
means. For example, p - 1 is represented as p + 0xffffffffffffffff, which
overflows when treated as an unsigned addition. I check for overflow in
p + x as follows:
is_overflow = ((intptr_t)x < 0) ? (p + x) > p : (p + x) < p;
Pointer arithmetic -- MEM_REF and TARGET_MEM_REF
------------------------------------------------
A calculation p + x can also be done using MEM_REF as &MEM[p + x].
Question: smtgcc does not check for overflow in address calculations for
MEM_REF and TARGET_MEM_REF. Should it treat overflow as UB in the same way
as POINTER_PLUS_EXPR?
Pointer arithmetic -- COMPONENT_REF
-----------------------------------
COMPONENT_REF adds the offset of a structure element to the pointer to the
structure.
Question: smtgcc does not check for overflow in address calculation for
COMPONENT_REF. Should it treat overflow as UB in the same way as
POINTER_PLUS_EXPR?
Pointer arithmetic -- ARRAY_REF
-------------------------------
ARRAY_REF adds the offset of the indexed element to the pointer to the
array.
* It is UB if the index is negative.
* If the TYPE_DOMAIN for the array type has a TYPE_MAX_VALUE and the
array is not a flexible array, it is UB if the index exceeds "one
after" this maximum value.
* If it is a flexible array or does not have a maximum value, it is
considered an unsized array, so all non-negative indices are valid.
But it is UB if the calculation of
offset = index * element_size
overflows.
Question: smtgcc does not check for overflow in the calculation of p +
offset. Should it treat overflow as UB in the same way as
POINTER_PLUS_EXPR?
Question: What is the correct way to check for flexible arrays? I am
currently using array_ref_flexible_size_p and flag_strict_flex_arrays, but
I noticed that middle-end passes do not seem to use
flag_strict_flex_arrays. Is there a more canonical way to determine how to
interpret flexible arrays?
Pointer arithmetic -- POINTER_DIFF_EXPR
---------------------------------------
Subtracting a pointer q from a pointer p is done using POINTER_DIFF_EXPR.
* It is UB if the difference does not fit in a signed integer with the
same precision as the pointers.
This implies that an object's size must be less than half the address
space; otherwise, POINTER_DIFF_EXPR cannot be used to compute sizes in C.
But there may be additional restrictions. For example, compiling the
function:
void foo(int *p, int *q, int n)
{
for (int i = 0; i < n; i++)
p[i] = q[i] + 1;
}
causes the vectorized code to perform overlap checks like:
_7 = q_11(D) + 4;
_25 = p_12(D) - _7;
_26 = (sizetype) _25;
_27 = _26 > 8;
_28 = _27;
if (_28 != 0)
goto <bb 11>;
else
goto <bb 12>;
which takes the difference between two pointers that may point to
different objects. This suggests that all objects must fit within half the
address space.
Question: What are the restrictions on valid address ranges and object
sizes?
Issues
------
The semantics I've described above result in many reports of
miscompilations that I haven't reported yet.
As mentioned earlier, the vectorizer can use POINTER_PLUS_EXPR to generate
pointers that extend up to a vector length past the object (and similarly,
up to one vector length before the object in a backwards loop). This is
invalid if the object is too close to address 0 or 0xffffffffffffffff.
And this out of bounds distance can be made arbitrarily large, as can be
seen by compiling the following function below for x86_64 with -O3:
void foo(int *p, uint64_t s, int64_t n)
{
for (int64_t i = 0; i < n; i++)
{
int *q = p + i * s;
for (int j = 0; j < 4; j++)
*q++ = j;
}
}
Here, the vectorized code add s * 4 to the pointer ivtmp_8 at the end of
each iteration:
<bb 5> :
bnd.8_38 = (unsigned long) n_10(D);
_21 = s_11(D) * 4;
<bb 3> :
# ivtmp_8 = PHI <ivtmp_7(6), p_12(D)(5)>
# ivtmp_26 = PHI <ivtmp_20(6), 0(5)>
MEM <vector(4) int> [(int *)ivtmp_8] = { 0, 1, 2, 3 };
ivtmp_7 = ivtmp_8 + _21;
ivtmp_20 = ivtmp_26 + 1;
if (ivtmp_20 < bnd.8_38)
goto <bb 6>;
else
goto <bb 7>;
<bb 6> :
goto <bb 3>;
This means calling foo with a sufficiently large s can guarantee wrapping
or evaluating to 0, even though the original IR before optimization did
not wrap or evaluating to 0. For example, when calling foo as:
foo(p, -((uintptr_t)p / 4), 1);
I guess this is essentially the same issue as PR113590, and could be fixed
by moving all induction variable updates to the latch. But if that
happens, the motivating examples for needing to handle invalid pointer
values would no longer occur. Would that mean smtgcc should adopt a more
restrictive semantics for pointer values?
/Krister