I'm working on ensuring that the GIMPLE semantics used by smtgcc are correct, and I have a lot of questions about the details. I'll be sending a series of emails with these questions. This first one is about pointers in general.

Each section starts with a description of the semantics I've implemented (or plan to implement), followed by concrete questions if relevant. Let me know if the described semantics are incorrect or incomplete.


Pointers values
---------------
C imposes restrictions on pointer values (alignment, must point to valid memory, etc.), but GIMPLE seems more permissive. For example, when vectorizing a loop:

  void foo(int *p, int n)
  {
    for (int i = 0; i < n; i++)
      p[i] = 1;
  }

the vectorized loop updates the pointer vectp_p.8 for the next iteration at the end of the loop, causing it to point up to a vector-length out of range after the last iteration:

  <bb 3> :
  # vectp_p.8_34 = PHI <vectp_p.8_35(6), p_8(D)(8)>
  # ivtmp_37 = PHI <ivtmp_38(6), 0(8)>
  MEM <vector(4) int> [(int *)vectp_p.8_34] = { 1, 1, 1, 1 };
  vectp_p.8_35 = vectp_p.8_34 + 16;
  ivtmp_38 = ivtmp_37 + 1;
  if (ivtmp_38 < bnd.5_31)
    goto <bb 6>;
  else
    goto <bb 10>;

  <bb 6> :
  goto <bb 3>;

Because of this, I allow all values -- essentially treating pointers as unsigned integers. Things like ensuring pointers point to valid memory or have correct alignment are only checked when dereferencing the pointer (I'll discuss pointer dereferencing semantics in the next email).


Pointer arithmetic -- POINTER_PLUS_EXPR
---------------------------------------
POINTER_PLUS_EXPR calculates p + x:
 * It is UB if the calculation overflows when
   TYPE_OVERFLOW_WRAPS(ptr_type) is false.
 * It is UB if p + x == 0, unless p == 0 and x == 0, unless
   flag_delete_null_pointer_checks is false.

Since the pointer type is unsigned, it's not entirely clear what overflow means. For example, p - 1 is represented as p + 0xffffffffffffffff, which overflows when treated as an unsigned addition. I check for overflow in p + x as follows:
  is_overflow = ((intptr_t)x < 0) ? (p + x) > p : (p + x) < p;


Pointer arithmetic -- MEM_REF and TARGET_MEM_REF
------------------------------------------------
A calculation p + x can also be done using MEM_REF as &MEM[p + x].

Question: smtgcc does not check for overflow in address calculations for MEM_REF and TARGET_MEM_REF. Should it treat overflow as UB in the same way as POINTER_PLUS_EXPR?


Pointer arithmetic -- COMPONENT_REF
-----------------------------------
COMPONENT_REF adds the offset of a structure element to the pointer to the structure.

Question: smtgcc does not check for overflow in address calculation for COMPONENT_REF. Should it treat overflow as UB in the same way as POINTER_PLUS_EXPR?


Pointer arithmetic -- ARRAY_REF
-------------------------------
ARRAY_REF adds the offset of the indexed element to the pointer to the array.

 * It is UB if the index is negative.
 * If the TYPE_DOMAIN for the array type has a TYPE_MAX_VALUE and the
   array is not a flexible array, it is UB if the index exceeds "one
   after" this maximum value.
 * If it is a flexible array or does not have a maximum value, it is
   considered an unsized array, so all non-negative indices are valid.
   But it is UB if the calculation of
      offset = index * element_size
   overflows.

Question: smtgcc does not check for overflow in the calculation of p + offset. Should it treat overflow as UB in the same way as POINTER_PLUS_EXPR?

Question: What is the correct way to check for flexible arrays? I am currently using array_ref_flexible_size_p and flag_strict_flex_arrays, but I noticed that middle-end passes do not seem to use flag_strict_flex_arrays. Is there a more canonical way to determine how to interpret flexible arrays?


Pointer arithmetic -- POINTER_DIFF_EXPR
---------------------------------------
Subtracting a pointer q from a pointer p is done using POINTER_DIFF_EXPR.
 * It is UB if the difference does not fit in a signed integer with the
   same precision as the pointers.

This implies that an object's size must be less than half the address space; otherwise, POINTER_DIFF_EXPR cannot be used to compute sizes in C. But there may be additional restrictions. For example, compiling the function:

  void foo(int *p, int *q, int n)
  {
    for (int i = 0; i < n; i++)
      p[i] = q[i] + 1;
  }

causes the vectorized code to perform overlap checks like:

  _7 = q_11(D) + 4;
  _25 = p_12(D) - _7;
  _26 = (sizetype) _25;
  _27 = _26 > 8;
  _28 = _27;
  if (_28 != 0)
    goto <bb 11>;
  else
    goto <bb 12>;

which takes the difference between two pointers that may point to different objects. This suggests that all objects must fit within half the address space.

Question: What are the restrictions on valid address ranges and object sizes?


Issues
------
The semantics I've described above result in many reports of miscompilations that I haven't reported yet.

As mentioned earlier, the vectorizer can use POINTER_PLUS_EXPR to generate pointers that extend up to a vector length past the object (and similarly, up to one vector length before the object in a backwards loop). This is invalid if the object is too close to address 0 or 0xffffffffffffffff.

And this out of bounds distance can be made arbitrarily large, as can be seen by compiling the following function below for x86_64 with -O3:

void foo(int *p, uint64_t s, int64_t n)
{
  for (int64_t i = 0; i < n; i++)
    {
      int *q = p + i * s;
      for (int j = 0; j < 4; j++)
        *q++ = j;
    }
}

Here, the vectorized code add s * 4 to the pointer ivtmp_8 at the end of each iteration:

  <bb 5> :
  bnd.8_38 = (unsigned long) n_10(D);
  _21 = s_11(D) * 4;

  <bb 3> :
  # ivtmp_8 = PHI <ivtmp_7(6), p_12(D)(5)>
  # ivtmp_26 = PHI <ivtmp_20(6), 0(5)>
  MEM <vector(4) int> [(int *)ivtmp_8] = { 0, 1, 2, 3 };
  ivtmp_7 = ivtmp_8 + _21;
  ivtmp_20 = ivtmp_26 + 1;
  if (ivtmp_20 < bnd.8_38)
    goto <bb 6>;
  else
    goto <bb 7>;

  <bb 6> :
  goto <bb 3>;

This means calling foo with a sufficiently large s can guarantee wrapping or evaluating to 0, even though the original IR before optimization did not wrap or evaluating to 0. For example, when calling foo as:

  foo(p, -((uintptr_t)p / 4), 1);

I guess this is essentially the same issue as PR113590, and could be fixed by moving all induction variable updates to the latch. But if that happens, the motivating examples for needing to handle invalid pointer values would no longer occur. Would that mean smtgcc should adopt a more restrictive semantics for pointer values?


   /Krister

Reply via email to