https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89337

--- Comment #9 from Martin Sebor <msebor at gcc dot gnu.org> ---
The warning is very simple: it just looks for excessive sizes in calls emitted
in the optimized IL.  When the call is there (either because it's in the source
code as is or because it's been synthesized by GCC based on the invariants it
can infer from the code) it triggers.  It runs after all high-level
optimizations, including DCE, and assumes that if the statement is in the IL it
is reachable.  Compiling the test case with -fdump-tree-optimized=/dev/stdout
shows the GIMPLE the warning works with:

  <bb 20> [local count: 233860936]:
  # iftmp.6_113 = PHI <iftmp.6_112(19), &D.3395.u.internal.str(18)>
  memset (iftmp.6_113, 0, 18446744073709551613);

I think the issue can essentially be reduced to the following:

$ cat z.c && gcc -O2 -S -fdump-tree-optimized=/dev/stdout z.c
void f (char *d, const char *s)
{
  if (__builtin_strstr (s, "ABC"))
    {
      __SIZE_TYPE__ n = __builtin_strlen (s) - 3;

      if (n > __builtin_strlen (s))   // cannot be true
        __builtin_memset (d, 0, n - __builtin_strlen (s));
  }
}

;; Function f (f, funcdef_no=0, decl_uid=1907, cgraph_uid=1, symbol_order=0)

Removing basic block 6
Removing basic block 7
f (char * d, const char * s)
{
  char * _1;
  long unsigned int _2;

  <bb 2> [local count: 1073741824]:
  _1 = __builtin_strstr (s_5(D), "ABC");
  if (_1 != 0B)
    goto <bb 3>; [70.00%]
  else
    goto <bb 5>; [30.00%]

  <bb 3> [local count: 751619278]:
  _2 = __builtin_strlen (s_5(D));
  if (_2 <= 2)
    goto <bb 4>; [33.00%]
  else
    goto <bb 5>; [67.00%]

  <bb 4> [local count: 248034361]:
  __builtin_memset (d_6(D), 0, 18446744073709551613); [tail call]

  <bb 5> [local count: 1073741824]:
  return;

}


z.c: In function ‘f’:
z.c:8:9: warning: ‘__builtin_memset’ specified size 18446744073709551613
exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
    8 |         __builtin_memset (d, 0, n - __builtin_strlen (s));
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

GCC doesn't have the smarts to figure out that when s contains a substring of
length 3 then strlen(s) must be at least 3.  As a result, it doesn't eliminate
the memset call in the function and the warning triggers.  Suppose we teach GCC
how to figure this out from the strstr call (which might be a useful
optimization) and then someone comes along with a test case that uses strspn()
instead of strstr().  We can also teach GCC about strstr() but then someone
else might use regcomp() or some user-defined function that we won't have
access to.  At some point we'll have to call it good enough.

It's helpful to keep in mind that no control flow or data flow-based warning is
free of false positives (or negatives), either in GCC or in any static
analyzer.  Analyzing only provably reachable code would mean not diagnosing
bugs in translation units without main because we cannot prove that they're
ever called.  Similarly, at the function level, it would mean not diagnosing
bugs in conditional statements unless the conditions could be proven to be true
at compile time.  That would make not only the false positive rate nearly zero,
it would also make the true positive rate nearly zero. In effect, it would make
the diagnostics virtually useless, able to detect only bugs in the simplest
code.  If you find a non-zero rate of false positives unacceptable then my
suggestion is to turn warnings off.  Not just this one but all others as most
have some false positives, including the front-ends ones that have no notion of
reachability.  Otherwise, if you accept that some false positives are
unavoidable and worth the checking GCC does, then until the strstr optimization
above is implemented, I would recommend to make use of __builtin_unreachable():

  void trim_asterisk(seastar::sstring &name) {
    if (name.find("ABC") != seastar::sstring::npos) {
      if (name.length () < 3)
        __builtin_unreachable ();
      name.resize(name.length() - 3);
    }
  }

Chances are that besides avoiding the warning it will also improve the quality
of the object code.  When hidden behind a well-named macro it might also
improve readability.

Reply via email to