https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119140

            Bug ID: 119140
           Summary: Diagnose with -fsanitize={bounds,bounds-strict} taking
                    address of one past last element if it is obviously
                    dereferenced in the same function
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: sanitizer
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jakub at gcc dot gnu.org
                CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
                    jakub at gcc dot gnu.org, kcc at gcc dot gnu.org
  Target Milestone: ---

In PR119132, the address of one past the last element of an array is taken
(not UB) but then immediately dereferenced in the same function.
I wonder if we could special-case such trivial cases and diagnose them too.
Consider
struct S { int a; int b[3]; int c; };

int *
foo (struct S *s, int i)
{
  return &s->b[i];
}

int
bar (struct S *s, int i)
{
  int *p = &s->b[i];
  return *p;
}

Right now we diagnose if i < 0 or i > 3 when calling these functions.
For foo that is the desired behavior: with i == 3 there is no UB.
For bar, there is no UB in the int *p = &s->b[i]; declaration, but with
i == 3 there is UB on *p.
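The two index ranges involved for foo and bar can be spelled out as a minimal
sketch. The names B_LEN, index_ok_for_address, index_ok_for_deref and
bar_checked are hypothetical, introduced here only for illustration; they are
not part of GCC or of the sanitizer runtime.

```c
#include <assert.h>
#include <stdbool.h>

struct S { int a; int b[3]; int c; };

/* Number of elements in S::b (hypothetical helper macro).  */
#define B_LEN ((int) (sizeof (((struct S *) 0)->b) / sizeof (int)))

/* Forming &s->b[i] is valid for 0 <= i <= B_LEN: pointing one past
   the last element is allowed.  */
static bool
index_ok_for_address (int i)
{
  return i >= 0 && i <= B_LEN;
}

/* Reading s->b[i] is valid only for 0 <= i < B_LEN.  */
static bool
index_ok_for_deref (int i)
{
  return i >= 0 && i < B_LEN;
}

/* Same shape as bar, with the stricter range asserted before the load.  */
static int
bar_checked (struct S *s, int i)
{
  assert (index_ok_for_deref (i));
  int *p = &s->b[i];
  return *p;
}
```

The only index on which the two predicates disagree is i == B_LEN, which is
exactly the case the current instrumentation lets through in bar.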

We instrument the ARRAY_REFs early in the FEs (note, peter0x44 on IRC
mentioned that clang also instruments &arr[0] + i, which we don't), and
unfortunately the relationship between the .UBSAN_BOUNDS calls and the taking
of the address is lost, so e.g. at *.ssa time we have roughly
  i.0_2 = i_1(D);
  .UBSAN_BOUNDS (0B, i.0_2, 4);
  _6 = &s_5(D)->b[i.0_2];
in both functions.
So, perhaps for the case where we allow one past last element, we could turn
that into a pass through IFN, so have something like
  i.0_2 = i_1(D);
  _5 = &s_5(D)->b[i.0_2];
  _6 = .UBSAN_BOUNDS (0B, i.0_2, 4, _5);
and then (ideally early, so that it doesn't hinder optimizations, say in the
ubsan pass) try to look at the uses of the .UBSAN_BOUNDS lhs (if any), and if
any of them is a dereference of non-zero size at an offset from 0 to
element_size - 1, adjust the .UBSAN_BOUNDS call into
  .UBSAN_BOUNDS (0B, i.0_2, 3);
  _6 = _5;
and otherwise keep the 4 in there.
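
The decision described above could be modelled roughly as follows. This is a
toy model in plain C, not GCC internals: struct use_info, use_forces_strict_bound,
pick_bound, bound_for_bar and bound_for_foo are hypothetical names, the bound
values simply follow the numbers in the dumps above (4 when one past the last
element is allowed, 3 otherwise), and a 4-byte int is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of one use of the .UBSAN_BOUNDS result.  */
struct use_info
{
  bool is_deref;  /* the use loads or stores through the pointer */
  long offset;    /* byte offset of the access from the pointer */
  long size;      /* access size in bytes */
};

/* True if the use touches the addressed element itself: a dereference
   of non-zero size at an offset in [0, elt_size - 1].  */
static bool
use_forces_strict_bound (const struct use_info *u, long elt_size)
{
  return u->is_deref && u->size > 0
	 && u->offset >= 0 && u->offset < elt_size;
}

/* Pick the bound argument: nelts if any use forces the strict check,
   nelts + 1 (one past the last element allowed) otherwise.  */
static long
pick_bound (const struct use_info *uses, size_t n_uses,
	    long nelts, long elt_size)
{
  for (size_t k = 0; k < n_uses; k++)
    if (use_forces_strict_bound (&uses[k], elt_size))
      return nelts;
  return nelts + 1;
}

/* bar: the pointer is loaded from, 4 bytes at offset 0 (assumed int size).  */
static long
bound_for_bar (void)
{
  const struct use_info uses[] = { { true, 0, 4 } };
  return pick_bound (uses, 1, 3, 4);
}

/* foo: the pointer is only returned, never dereferenced.  */
static long
bound_for_foo (void)
{
  const struct use_info uses[] = { { false, 0, 0 } };
  return pick_bound (uses, 1, 3, 4);
}
```

In this model bar ends up with the strict bound and foo keeps the lenient
one, matching the intent of the rewrite sketched in the dumps.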

Or should we kill some optimizations and defer the tests for dereferences (or
repeat them) soon after IPA to handle even inlining?
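
For illustration, the inlining case might look like the sketch below, where
the address is taken in one function and dereferenced in its caller. The
names elt_addr and baz are hypothetical and do not come from PR119132.

```c
#include <assert.h>

struct S { int a; int b[3]; int c; };

/* Only takes the address; seen in isolation, i == 3 is fine here.  */
static inline int *
elt_addr (struct S *s, int i)
{
  return &s->b[i];
}

/* Once elt_addr is inlined, this has the same shape as bar: the
   address is taken and then dereferenced in the same function.  */
int
baz (struct S *s, int i)
{
  return *elt_addr (s, i);
}
```

A check that only looks at uses within the original function body would not
see the dereference in baz; it only becomes visible after inlining.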

Thoughts on this?
