void* <-> char* aliasing rule, C standard or glitch?

2006-07-01 Thread Elmar Krieger

Hi GCClers,

I searched hard, but couldn't determine conclusively whether the C standard 
allows a void* pointer to be aliased with a char* pointer.


If that's not undefined behavior, then the following may be a glitch in 
GCC 4.1.0 when compiled with -O2.


Here's the ugly minimal piece of code:

/* ASSUME pointer POINTS TO AN INTEGER THAT SPECIFIES THE INCREMENT */
int increment(int *pointer)
{ return(*pointer); }

/* INCREMENT A POINTER BY THE NUMBER OF BYTES POINTED TO */
void *incremented(void *pointer)
{ int inc=increment(pointer);
  *((char**)&pointer)+=inc;
  return(pointer); }

What goes wrong is that the function incremented() increments the 
pointer, but returns the original, non-incremented value.


Here's the assembly code:

Dump of assembler code for function incremented:
0x08051f80 : push   %ebp
0x08051f81 : mov    %esp,%ebp
0x08051f83 : push   %ebx
0x08051f84 : sub    $0x4,%esp
0x08051f87 : mov    0x8(%ebp),%ebx
0x08051f8a : mov    %ebx,(%esp)
0x08051f8d : call   0x8051f70
0x08051f92 : add    %eax,0x8(%ebp)
0x08051f95 : mov    %ebx,%eax
**
-->> Here's the problem, line above should read mov 0x8(%ebp),%eax
**
0x08051f97 : add    $0x4,%esp
0x08051f9a : pop    %ebx
0x08051f9b : pop    %ebp
0x08051f9c : ret
0x08051f9d : lea    0x0(%esi),%esi


Thanks for your time, and BTW: is there a 'better' construct for 
incrementing a void* pointer other than *((char**)&pointer)+=inc ? ;-)
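
One possible answer to that last question, offered as a sketch rather than a verdict on GCC: void* and char* are distinct, incompatible types, so writing to the void* object 'pointer' through a char* lvalue does not appear to be covered by the aliasing exceptions in C99 6.5p7, and at -O2 the compiler may keep the old value in a register. Converting the pointer VALUE instead of reinterpreting the pointer OBJECT avoids the issue (the const qualifier is my addition):

```c
/* Read the increment (in bytes) that the pointer refers to. */
int increment(const int *pointer)
{ return(*pointer); }

/* Return the pointer advanced by that many bytes. The VALUE is converted
   to char*, so no void* object is ever accessed through a char* lvalue. */
void *incremented(void *pointer)
{ int inc=increment(pointer);
  return((char*)pointer+inc); }
```

The result converts back to void* implicitly on return, so the caller sees the same interface as before.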


Greetings,
Elmar


New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion

2012-08-08 Thread Elmar Krieger

Dear all,

while I fully understand that GCC's steadily advancing optimization 
capabilities can't be 'for free', the latest versions have become almost 
unusably slow for me:


With simple optimization -O, compiling a certain C source file (~6 
lines) now takes 4.5 minutes, while older GCCs did it in 14 seconds 
(that's 19x as long, requiring a cup of coffee after each code change). 
(Core i7 CPU).


For completeness, these are the GCC versions:

Slow with 264 seconds:

i686-linux-android-gcc (GCC) 4.6.x-google 20120106 (prerelease)
Copyright (C) 2011 Free Software Foundation, Inc.

Fast with 14 seconds:

i386-mingw32msvc-gcc (GCC) 3.2.3 (mingw special 20030504-1)
Copyright (C) 2002 Free Software Foundation, Inc.

The slowdown is not the same with other files, so I'm essentially sure 
that this specific source file has some 'feature' that catches GCC on 
the wrong foot. This raises my hopes that one of the GCC experts wants to 
take a look at it. The code is confidential, but if you agree on one 
expert to have a look, I'll provide it privately (please contact elmar 
_at_ cmbi.ru.nl). Could be related to the fact that this particular 
source file contains the application's main loop, a single large 
function with 28000 lines.



One other thing I just thought of: GCC has a history of very smart 
extensions to C that allow writing faster and more elegant code. If I 
look at my code, there are mostly two sources of 'dirty hacks' left. One 
that could be fixed easily is the 'void** pointer problem', which 
clutters my code with nasty explicit type casts:


A simple example is the function freesetnull, which frees a pointer and 
sets it to NULL (ptradd is the pointer's address):


void freesetnull(void **ptradd)
{ free(*ptradd);
  *ptradd=NULL; }

Unfortunately, this function cannot be used in practice, because calling 
it yields an error:


error: passing argument 1 of ‘freesetnull’ from incompatible pointer type
note: expected ‘void **’ but argument is of type ‘char **’
[or whatever pointer you want to free]


If I understand correctly from a Usenet discussion, the reason for this 
error is that the C standard allows pointers to have different sizes (so 
sizeof(void*) might be larger than sizeof(int*) or so).
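
For what it's worth, the void** version can be called without any cast by going through a genuine void* object. A minimal sketch (the helper name release_text is made up for illustration):

```c
#include <stdlib.h>

/* Free the pointer stored at ptradd and set it to NULL. */
void freesetnull(void **ptradd)
{ free(*ptradd);
  *ptradd=NULL; }

/* Hypothetical caller: free a char* through freesetnull without casts.
   The char* is copied into a real void* object, whose address really
   has type void**, and the (now NULL) value is handed back afterwards. */
char *release_text(char *text)
{ void *tmp=text;
  freesetnull(&tmp);
  return(tmp); }
```

This costs an extra temporary and a copy back at every call site, which is its own kind of clutter, but it stays within what the standard defines.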


To get rid of this error, I need to sacrifice type safety and clutter my 
code with explicit casts, e.g.


void freesetnull(void *ptradd)
{ free(*(void**)ptradd);
  *(void**)ptradd=NULL; }

But since I've never seen such an exotic architecture with different 
pointer sizes and am 100% certain that my application will never run on 
such an architecture, I feel the strong need to sit down and contribute 
a GCC patch that turns this error into a warning that can be disabled 
(on mainstream architectures where all pointers have the same size).


To me, that just looks like a remnant from the ancient past that hinders 
the future. On the other hand, my feeling tells me that this patch would 
not be accepted, that's why I'm asking for my chances in advance ;-)


Best regards and many thanks,
Elmar


Re: New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion

2012-08-08 Thread Elmar Krieger

Hi David, hi Andrew,


>> One other thing I just thought of: GCC has a history of very smart
>> extensions to C that allow to write faster and more elegant code. If I
>> look at my code, there are mostly two sources of 'dirty hacks' left. One
>> that could be fixed easily is the 'void** pointer problem', that
>> clutters my code with nasty explicit type casts:
>>
>> A simple example is the function freesetnull, that frees a pointer and
>> sets it NULL (ptradd is a pointer address):
>>
>> void freesetnull(void **ptradd)
>> { free(*ptradd);
>>   *ptradd=NULL; }
>
> Normally, good style encourages using static inline functions instead of
> macros, but here I would go for a macro:
>
> #define freesetnull(_p) do { free(_p); _p = NULL; } while (0)
>
> Note the slight change in the semantics - you pass the pointer, not a
> pointer to the pointer.
>
> I expect this will also have the bonus of improving the speed of the
> resulting program - and possibly even of the compilation.

Many thanks, but this was really just the simplest example to 
illustrate how GCC's refusal to accept a 'pointer to any pointer' where a 
function expects a void** causes more problems than it solves. In your 
case, you are forced to use a macro (with the usual side-effects 
problem) instead of a safe 'static inline' to circumvent the problem.
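
As an illustration of a possible middle ground (a sketch, not something proposed in this thread): with GCC's __typeof__ extension, a macro can take the pointer's address like the original function, evaluate its argument only once, and still need neither a void** conversion nor a cast. The name FREESETNULL is made up here:

```c
#include <stdlib.h>

/* Hypothetical macro: takes the ADDRESS of any object pointer.
   __typeof__ is a GNU extension, so this is not portable ISO C.
   The argument is evaluated exactly once, and *pp_ is accessed with
   its own type, so no void** conversion is involved. */
#define FREESETNULL(ptradd) \
  do { __typeof__(ptradd) pp_=(ptradd); \
       free(*pp_); \
       *pp_=NULL; } while (0)
```

Because the argument is copied into pp_ first, a call such as FREESETNULL(cursor++) advances cursor only once.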


But my code is full of cases that go beyond what can reasonably be done 
with macros.


Just one more complicated example:

A function that loads a binary file from disk and allocates the required 
memory to store the file contents, returning the number of bytes read. 
dstadd is the address where the newly allocated pointer is stored:


int dsc_loadfilealloc(void *dstadd,char *filename)
{ int read,size;
  FILE *fb;

  if ((fb=fopen(filename,"rb")))
  { size=dsc_filesize(filename);
    *(void**)dstadd=mem_alloc(size);
    read=dsc_readbytes(*(void**)dstadd,fb,size);
    *(void**)dstadd=mem_realloc(*(void**)dstadd,read);
    fclose(fb);
    return(read); }
  *(void**)dstadd=NULL;
  return(0); }

Again, nasty casts all over the place, which would all disappear if GCC 
allowed me to write


int dsc_loadfilealloc(void **dstadd,char *filename)

which could then be used to load anything from text files to an array of 
'struct foo'.
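
One cast-free way to get the same effect in standard C, shown only as a sketch, is to swap the roles of the two results: return the buffer and report the byte count through an out-parameter. The name load_file_alloc is hypothetical, and plain stdio/stdlib calls stand in for dsc_filesize, mem_alloc and dsc_readbytes, whose definitions are not shown here:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant of dsc_loadfilealloc: the buffer is the return
   value and the byte count goes through *nread, so callers can assign
   the result to any object pointer without a cast.
   Returns NULL (and *nread==0) if the file is missing, empty or unreadable. */
void *load_file_alloc(const char *filename,size_t *nread)
{ FILE *fb;
  long size;
  void *buf=NULL;

  *nread=0;
  if (!(fb=fopen(filename,"rb"))) return(NULL);
  /* Determine the file size by seeking to the end and back. */
  if (fseek(fb,0,SEEK_END)==0&&(size=ftell(fb))>0&&fseek(fb,0,SEEK_SET)==0)
  { buf=malloc((size_t)size);
    if (buf) *nread=fread(buf,1,(size_t)size,fb); }
  fclose(fb);
  if (buf&&*nread==0)
  { free(buf);
    buf=NULL; }
  return(buf); }
```

A caller would then write something like 'struct foo *items=load_file_alloc("foo.bin",&bytes);', relying on the implicit conversion from void* that C already provides.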



> Regarding compile time, modern gcc has "-funit-at-a-time" and
> "-ftoplevel-reorder" enabled by default, even with no optimisation
> enabled. These could take time on such a huge source code. Try compiling
> with "-fno-unit-at-a-time" (which implies "-fno-toplevel-reorder").


Many thanks, tried that and the time at -O !increased! (I am writing 
this from a slower computer) from 6 minutes 38 to 7 minutes. BTW, the 
time with -O0 is just 17 seconds.


And to Andrew Haley:

>> To me, that just looks like a remnant from the ancient past that hinders
>> the future. On the other hand, my feeling tells me that this patch would
>> not be accepted, that's why I'm asking for my chances in advance ;-)
>
> Not at all high.  See Type-Based Alias Analysis
> <http://www.drdobbs.com/cpp/type-based-alias-analysis/184404273>
> for one reason.

Thanks, I read the article, but didn't really see how forbidding a 
function with argument void** to accept a pointer to any pointer helps 
with aliasing.


If it's perfectly normal that a function with argument void* accepts any 
pointer, then a function with argument void** should accept a pointer to 
any pointer by analogy, without having additional aliasing problems, no?


All the best,
Elmar

--
Elmar Krieger, PhD
YASARA Biosciences & CMBI Outstation Austria
Wagramer Strasse 25/3/45
1220 Vienna
Austria/Europe
www.YASARA.org


Re: New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion

2012-08-10 Thread Elmar Krieger

Hi Ian, hi Richard, hi Andi!

Many thanks for your comments.

>>> The slowdown is not the same with other files, so I'm essentially sure
>>> that this specific source file has some 'feature' that catches GCC at
>>> the wrong leg. This raises my hopes that one of the GCC experts wants
>>> to take a look at it. The code is confidential,
>>
>> You could file a bug report with just a profile output of the compiler
>> (e.g. from oprofile or perf)
>
> But please use a pristine FSF compiler.  You can also run the source through
> some obfuscation tool.  Or get a first hint with using -ftime-report.
>
> In the end, without a testcase there is nothing to do for us ...

I downloaded the latest official GCC 4.7.1, but unfortunately configure 
stopped with "Building GCC requires GMP 4.2+, MPFR 2.3.1+ and MPC 
0.8.0+.", and for my CentOS Linux, only older versions of these libs are 
available as RPMs. I saw many hours of manual fiddling ahead, so I 
suggest a more efficient solution:


I have now sent the confidential source file by private message to Richard; 
please spend 5 minutes running these two commands on it:


time gcc -m32 -g -O0 -fno-strict-aliasing -x c -Wall -Werror -c model.i
time gcc -m32 -g -O -fno-strict-aliasing -x c -Wall -Werror -c model.i

If you don't find an enormous slowdown with the second command (please 
post your timings) and conclude that this problem has been introduced by 
Google in their custom GCC, I'll pay you 100 USD for the 5 minutes wasted.


To Ian:


>>> Not at all high.  See Type-Based Alias Analysis
>>> <http://www.drdobbs.com/cpp/type-based-alias-analysis/184404273>
>>> for one reason.
>>
>> Thanks, I read the article, but didn't really see how forbidding a function
>> with argument void** to accept a pointer to any pointer helps with aliasing.
>>
>> If it's perfectly normal that a function with argument void* accepts any
>> pointer, then a function with argument void** should accept a pointer to any
>> pointer by analogy, without having additional aliasing problems, no?
>
> The C and C++ languages could work that way, yes.  But they don't.
> GCC attempts to implement the standard language.


Yep, that's why I mentioned how GCC's smart extensions to the standard 
language saved the day many times in the past ;-)



> Aliasing issues arise when a function has two pointers, and determine
> whether an assignment to *p1 might change the value at *p2.  There are
> no aliasing issues with a void* pointer, because if p1 is void* then
> *p1 is invalid.  That is not true for a void** pointer, so aliasing
> issues do arise.  If p1 is void** and p2 is int**, then GCC will
> assume that an assignment to *p1 does not change the value at *p2, as
> the language standard states.  It's easy to imagine that that could
> break a program after inlining.
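
To make the quoted scenario concrete, here is a small sketch (store_both is a hypothetical name, not code from this thread). If a caller were allowed to pass the address of the same int* object for both pointer-to-pointer parameters, the function could legitimately be compiled to return a, even though that object would then hold b:

```c
/* p1 and p2 point to objects of different types (void* vs. int*), so
   under strict aliasing the compiler may assume that the store through
   p1 cannot change *p2, and may return a without reloading *p2. */
int *store_both(void **p1,int **p2,int *a,int *b)
{ *p2=a;
  *p1=b;
  return(*p2); }
```

With two distinct objects, as in 'store_both(&v,&ip,&x,&y)' for a 'void *v' and an 'int *ip', the result is &x either way; only the aliased call is affected by the assumption.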


Many thanks for the clarification, and it also points to a simple solution:

GCC could simply permit passing a pointer to any pointer to a function 
if the function argument is of type 'void **restrict myptr'.


If adding a 'restrict' to a function declaration were the only thing 
required to get rid of countless nasty explicit type casts, the day 
would already be saved. There really seem to be lots of problem classes 
that can otherwise only be solved with explicit type casts. The example 
of loading a binary file from disk and allocating the required memory 
to store the file contents is just one of them...


Best regards,
Elmar






Re: New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion

2012-08-13 Thread Elmar Krieger

Hi Richard,

many thanks for saving my time.


>> time gcc -m32 -g -O -fno-strict-aliasing -x c -Wall -Werror -c model.i
>
> That's within reasonable bounds as well, IMHO (you can't really compare
> -O1 from 3.2.3 with -O1 from 4.6.3).  One more data point (-O2 tends to
> be more focused on, no debuginfo generation turns off improvements
> and its costs there):
>
> /usr/bin/time /space/rguenther/install/gcc-3.2.3/bin/gcc -S -o
> /dev/null model.i -march=i386 -fno-strict-aliasing -w -O2
> 17.31user 0.43system 0:17.82elapsed 99%CPU (0avgtext+0avgdata
> 427392maxresident)k
> 72inputs+0outputs (2major+69895minor)pagefaults 0swaps
>
> /usr/bin/time gcc-4.6 -S -o /dev/null model.i -march=i386
> -fno-strict-aliasing  -m32 -w -O2
> 18.12user 0.21system 0:18.43elapsed 99%CPU (0avgtext+0avgdata
> 1752784maxresident)k
> 0inputs+0outputs (0major+124029minor)pagefaults 0swaps
>
> same time, I am surprised again ;) (with improvements in CPU speed the
> compilation with 4.6.3 is actually _faster_ comparing commodity platforms
> from the date of the compiler releases).
>
> You might want to try -ftime-report, if it says you have extra checkings
> enabled for one compiler but not the other that will explain the different
> outcome at your side:

Good news, and especially the -ftime-report trick was highly useful.

For example, I got a huge slowdown also with this compiler:

gcc44 (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3)
Copyright (C) 2010 Free Software Foundation, Inc.

which spends all its time in 'variable tracking':


variable tracking : 126.07 (89%) usr   0.26 ( 7%) sys 126.50 (87%) wall   20647 kB ( 6%) ggc
TOTAL             : 141.94             3.66           145.61             336368 kB


real    2m26.703s


And the Google Android compiler I reported originally...

i686-linux-android-gcc (GCC) 4.6.x-google 20120106 (prerelease)
Copyright (C) 2011 Free Software Foundation, Inc.

...which takes more than twice as long, spends its time here:

phase cgraph          : 347.75 (100%) usr  10.73 (76%) sys 358.51 (99%) wall  130837 kB (84%) ggc
phase generate        : 347.85 (100%) usr  10.77 (76%) sys 358.64 (99%) wall  132490 kB (85%) ggc
var-tracking dataflow : 284.34 (82%) usr   0.00 ( 0%) sys 284.21 (78%) wall       0 kB ( 0%) ggc
TOTAL                 : 350.04            12.53           362.60             155292 kB


real    6m3.567s

I really didn't expect that RedHat and Google both mess up GCC with 
their modifications, so I'll report it to them instead ;-)


Anyway, please send me by private email your preferred way of receiving the 
promised 100 USD. Could be PayPal, a list of Amazon.com items to be 
sent to your address, a direct bank transfer, etc.


Best regards,
Elmar


If you don't find an enormous slowdown with the second command (please post
your timings) and conclude that this problem has been introduced by Google
in their custom GCC, I'll pay you 100 USD for the 5 minutes wasted.


Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.

Disclaimer: I had to delete an include statement on top of the file I sent you
to make it compile.

Richard.



