https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82946

--- Comment #7 from rguenther at suse dot de <rguenther at suse dot de> ---
On Thu, 16 Nov 2017, msebor at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82946
> 
> --- Comment #6 from Martin Sebor <msebor at gcc dot gnu.org> ---
> (In reply to rguent...@suse.de from comment #5)
> > This means you can very well replace memcpy with strcpy if you know
> > there's a '\0' in and only in the right place.
> 
> Sure, except when dealing with a string literal we know that the source is a
> string literal and not a pointer representation disguised as a sequence of
> bytes.  The optimization I'm referring to is specifically for string literals:
> 
>   unsigned g (struct A *a)
>   {
>     strcpy (a->d, "123");   // here we have a literal, not the
>                             // representation of a pointer
>     return strlen (a->d);   // a->d must be a valid pointer
>   }
> 
> > We certainly have to treat literal pointers encoded in any form
> > conservatively.  I don't see how they are against any standard.  There's
> > other clearly "valid" optimizations missing in GCC that look more
> > important to implement.
> 
> The C and C++ standards are clear as to what are valid pointer values and how
> they can come about.  Copying the representation from an arbitrary constant of
> an incompatible type into a pointer object is certainly not one of them.
> I.e., this:
> 
>   const char a[] = "123";
>   char *p;
>   memcpy (&p, a, sizeof p);
>   strlen (p);
> 
> is undefined, but this is of course valid:
> 
>   const char a[4] = "123";
>   const char *p;
>   const char *q = a;
>   memcpy (&p, &q, sizeof p);
>   strlen (p);
> 
> because it just copies the representation of what's known to be a valid
> pointer value into another pointer object of a compatible type.
> 
> The point is that the bytes of a string literal can never be a valid pointer
> value, even if they happen to have the same representation as one, and this can
> be exploited to allow the optimization above.  It will not invalidate any
> correct programs.  It would be not only invalid but downright silly for a
> program to represent valid addresses as string literals.  Embedded programs of
> course do hardcode pointer values, but not in string literals: they hardcode
> them as integers, e.g.,
> 
>   void *my_register = (void*)0x123;
> 
> but never like so:
> 
>   char my_register[] = "123";

Ok.  I guess I have some patches somewhere that "properly" distinguish
string literals during points-to analysis which might help this case.
Or maybe not.

As with the other cases, I have a hard time imagining how to implement this
and how to transfer such knowledge to the alias-oracle / IL.
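
For reference, here is a rough sketch of what the strlen folding discussed
in the quoted comment would amount to at the source level, next to the
pointer-copy case that both sides agree is valid. The layout of struct A is
not shown in this message, so the definition below is an assumption, and
g_folded and copy_pointer_representation are hypothetical names made up for
illustration. This is not what GCC actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout: the real struct A is not shown in this message. */
struct A {
    char *p;      /* assumed pointer member */
    char d[8];    /* assumed array member, large enough for "123" */
};

/* The function from comment #6, with an explicit cast on the result. */
unsigned g(struct A *a)
{
    assert(a != NULL);
    strcpy(a->d, "123");             /* the source is a string literal */
    return (unsigned)strlen(a->d);   /* length is recomputed at run time */
}

/* Hand-folded equivalent: the length is taken from the literal itself. */
unsigned g_folded(struct A *a)
{
    assert(a != NULL);
    memcpy(a->d, "123", sizeof "123");    /* copies '1', '2', '3', '\0' */
    return (unsigned)(sizeof "123" - 1);  /* 3, known at compile time */
}

/* The valid case: copy the representation of a real pointer value. */
size_t copy_pointer_representation(void)
{
    static const char a[4] = "123";
    const char *q = a;          /* q holds a valid pointer value */
    const char *p;
    memcpy(&p, &q, sizeof p);   /* p now has the same value as q */
    return strlen(p);
}
```

With a local struct A object, g and g_folded both return 3 and leave "123"
in the d member, which is the equivalence the optimization would rely on.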
