https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86265
--- Comment #7 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 25 Jun 2018, msebor at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86265
>
> --- Comment #6 from Martin Sebor <msebor at gcc dot gnu.org> ---
> The strlen range optimization doesn't take advantage of undefined behavior --
> like all other optimizations, it simply assumes code is free of it.
>
> I have two goals for the warnings I work on: a) most important is to find bugs
> in user code, and b) less important is to drive improvements to help GCC better
> analyze source code and emit more efficient object code.
>
> By relying on valid calls to strcpy() writing only into the destination array
> and not beyond, and reading only from the source array and not beyond, GCC can
> safely assume that other members of the same struct or other elements of the
> same array of structs than the one written to are unchanged by the strcpy()
> call.  For instance, in the following, the tests can safely be eliminated:
>
>   struct A {
>     char a[4];
>     int i;
>   };
>
>   void f (struct A *a)
>   {
>     int i = a[0].i + a[1].i;
>
>     __builtin_strcpy (a[0].a, a[1].a);
>
>     if (i != a[0].i + a[1].i)
>       __builtin_abort ();
>   }
>
> There is no reason not to take advantage of this except to cater to
> exceptionally poorly written (and I'd say exceedingly rare) code, and thus
> penalize the overwhelming majority of code that doesn't violate the basic rules
> of the language.

The "basic rules of the language" are hard to understand, and not only
because GCC chooses to apply them selectively (to str* but not to mem*).
As a GCC developer I'm more concerned about applying rules of the C
language to the GIMPLE IL without much consideration.

I guess exceptions for mem* also exist because other FEs _do_ emit calls
to those functions as part of their IL lowering to GENERIC.
Now I would fully expect they do the same for at least a subset of str*,
simply because when strings are a first-class entity in a programming
language and you are using the C runtime, you have no choice but to use
those functions.

Until now the GIMPLE IL rule was that ADDR_EXPR expressions are just
address computations, and you are not allowed to make any assumptions
about how the address is used based on the structure of a component
reference appearing inside it.  This very rule made it possible to
aggressively propagate into address computations, while at the same time
restricting propagation into dereferences.

That str* argument ADDR_EXPRs now carry semantic value (the same issue
applies to the personally "much loved" __builtin_object_size) is
disturbing.  Does this basically mean we may not propagate into address
computations as much as we do?

That is, enforcing C language rules to do more optimization, without
evaluating which transforms GCC itself performs that are invalid under
those C language rules, has led to wrong-code bugs in the past.  Doing
that for diagnostics alone can only lead to false positive diagnostics...
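
To make the ADDR_EXPR concern concrete, here is a minimal sketch in C. The function names are hypothetical and not taken from GCC or from the bug report. It only shows that an address spelled as a component reference and the same address spelled as plain pointer arithmetic are equal at run time. If an optimization attaches a bound to the first spelling, propagation that rewrites one spelling into the other would no longer be neutral. The example stays within the bounds of `a`, so it does not rely on undefined behavior, and it does not claim to reproduce any particular GCC transformation.

```c
#include <stddef.h>
#include <string.h>

struct A {
  char a[4];
  int i;
};

/* Address written as a component reference: &p->a[0].  */
const char *addr_via_member (const struct A *p)
{
  return &p->a[0];
}

/* The same address written as pointer arithmetic on the enclosing object.  */
const char *addr_via_offset (const struct A *p)
{
  return (const char *) p + offsetof (struct A, a);
}

/* Returns 1 when both spellings yield the same pointer value.  */
int same_address (const struct A *p)
{
  return addr_via_member (p) == addr_via_offset (p);
}

/* strlen() on the member spelling.  */
size_t len_via_member (const struct A *p)
{
  return strlen (p->a);
}

/* strlen() on the pointer-arithmetic spelling.  */
size_t len_via_offset (const struct A *p)
{
  return strlen ((const char *) p + offsetof (struct A, a));
}
```

For a string that fits in the member, e.g. `struct A x = { "abc", 7 };`, both spellings give the same address and the same length. The open question raised above is what the middle end is allowed to assume when they would not.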