gcc miscompiling duff's device (probaby two different bugs)
Hi guys, I have the following variation on Duff's device that seems to mis-compile on all GCC versions I can access within a minute (that is gcc-3.{3,4}, gcc-4.{1,2,3,4} on x86 and gcc-4.3.2 on x86_64). The symptoms are as follows: $ gcc-4.4 -o duffbug duffbug.c ; ./duffbug { he��3) { hello world ! } As you can observe in the difference between duff4_works() and duff4_fails(), apparently due to for-loop initializer being externalized vs. specified as the first for-loop expression. It doesn't matter if the 'case 0' is labeling the for-loop, or the first statement in the for-loop in case of duff4_works() of course. However, older gcc-3.x do give a warning though if the 'case 0' labels the first statement for duff4_fails(), since the first expression in the for-loop is then inaccessible. All gcc-4.x versions don't warn, even when supplied with the -Wall flag (which is wrong, hence this *first* bug): $ gcc-4.4 -Wall -o duffbug duffbug.c ; ./duffbug $ gcc-3.4 -Wall -o duffbug duffbug.c ; ./duffbug duffbug.c: In function `duff4_fails': duffbug.c:28: warning: unreachable code at beginning of switch statement I think the compiler is generating wrong code for duff4_fails() when 'case 0' labels the for-loop. It somehow skips the first for-loop expression, just as if 'case 0' pointed to the first statement in the for-loop (hence this *second* bug). Haven't checked the assembly though... Kind regards, Pjotr Kourzanov #include int duff4_works(char * dst,const char * src,const size_t n) { const size_t rem=n % 4, a=rem + (!rem)*4; char * d=dst+=a; const char * s=src+=a; dst+=n; switch (rem) { case 0: for(/* gcc bug? dst+=n*/;d
Re: gcc miscompiling duff's device (probaby two different bugs)
On Tue, 2010-03-02 at 12:26 +0100, Richard Guenther wrote: > On Tue, Mar 2, 2010 at 12:00 PM, Pjotr Kourzanov > wrote: > > On Tue, 2010-03-02 at 10:47 +, Andrew Haley wrote: > >> On 03/02/2010 10:34 AM, Pjotr Kourzanov wrote: > >> > >> >> int duff4_fails(char * dst,const char * src,const size_t n) > >> >> { > >> >> const size_t rem=n % 4, a=rem + (!rem)*4; > >> >> char * d=dst+=a; > >> >> const char * s=src+=a; > >> >> /* gcc bug? dst+=n; */ > >> >> > >> >> switch (rem) { > >> >> case 0: for(dst+=n;d >> >> /*case 0:*/d[-4]=s[-4]; > >> >> case 3:d[-3]=s[-3]; > >> >> case 2:d[-2]=s[-2]; > >> >> case 1:d[-1]=s[-1]; > >> >> } > >> >> } > >> >>return 0; > >> >> } > >> >> The first time around the loop the initializer (d+=n) is jumped around, > >> >> so > >> >> d == dst. At the end of the loop, d+=4, so d > dst. Therefore the loop > >> >> exits. > >> > > >> > And its wrong since it shouldn't jump around the initializer. > >> > >> Sure it should. On entry to that loop, rem == 3. > > > > I agree, this is one of the places where referential transparency > > breaks in C. I wouldn't have expected that the compiler could or > > would put the first expression before the switch in this case: > > > > switch (rem) { > > for(dst+=n;d > case 0: d[-4]=s[-4]; ... > > }} > > > > However, the warning is still due, since a combination of a switch with a > > for loop results in code that is completely ignored, i.e., is inaccessible. > > As I said, gcc-3.x used to issue a warning for this one... > > Neither 2.95 nor 3.3.6 or 3.4.6 warn for me. That's weird. I am having: gcc-3.3 (GCC) 3.3.6 (Ubuntu 1:3.3.6-15ubuntu6) Configured with: ../src/configure -v --enable-languages=c,c++ --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --enable-__cxa_atexit --with-system-zlib --enable-nls --without-included-gettext --enable-clocale=gnu --enable-debug i486-linux-gnu Thread model: posix gcc-3.4 (GCC) 3.4.6 (Ubuntu 3.4.6-8ubuntu2) Configured with: ../src/configure -v --enable-languages=c,f77 --prefix=/usr --libexecdir=/usr/lib --with-gxx-include-dir=/usr/include/c++/3.4 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --program-suffix=-3.4 --enable-__cxa_atexit --with-tune=pentium4 i486-linux-gnu Thread model: posix Are you sure you test the following variant? switch (rem) { for(dst+=n;d > Richard. > > >> > >> Andrew. > >> > > > > > > >
Re: gcc miscompiling duff's device (probaby two different bugs)
Dear all, Although I probably shouldn't have been so harsh calling this "mis-compiling", do you see any chance of back-porting this warning back into the mainline? P.S. The rationale of this exercise is of course that the "switch", being a goto in disguise needs careful attention, just like the goto. So combining it a-la Tom Duff with another construct blessed by Dijkstra, the for loop, can sometimes lead to unexpected results. Pjotr On Tue, 2010-03-02 at 14:07 +0100, Peter Kourzanov wrote: > On Tue, 2010-03-02 at 12:26 +0100, Richard Guenther wrote: > > On Tue, Mar 2, 2010 at 12:00 PM, Pjotr Kourzanov > > wrote: > > > On Tue, 2010-03-02 at 10:47 +, Andrew Haley wrote: > > >> On 03/02/2010 10:34 AM, Pjotr Kourzanov wrote: > > >> > > >> >> int duff4_fails(char * dst,const char * src,const size_t n) > > >> >> { > > >> >> const size_t rem=n % 4, a=rem + (!rem)*4; > > >> >> char * d=dst+=a; > > >> >> const char * s=src+=a; > > >> >> /* gcc bug? dst+=n; */ > > >> >> > > >> >> switch (rem) { > > >> >> case 0: for(dst+=n;d > >> >> /*case 0:*/d[-4]=s[-4]; > > >> >> case 3:d[-3]=s[-3]; > > >> >> case 2:d[-2]=s[-2]; > > >> >> case 1:d[-1]=s[-1]; > > >> >> } > > >> >> } > > >> >>return 0; > > >> >> } > > >> >> The first time around the loop the initializer (d+=n) is jumped > > >> >> around, so > > >> >> d == dst. At the end of the loop, d+=4, so d > dst. Therefore the > > >> >> loop > > >> >> exits. > > >> > > > >> > And its wrong since it shouldn't jump around the initializer. > > >> > > >> Sure it should. On entry to that loop, rem == 3. > > > > > > I agree, this is one of the places where referential transparency > > > breaks in C. I wouldn't have expected that the compiler could or > > > would put the first expression before the switch in this case: > > > > > > switch (rem) { > > > for(dst+=n;d > > case 0: d[-4]=s[-4]; ... > > > }} > > > > > > However, the warning is still due, since a combination of a switch with a > > > for loop results in code that is completely ignored, i.e., is > > > inaccessible. > > > As I said, gcc-3.x used to issue a warning for this one... > > > > Neither 2.95 nor 3.3.6 or 3.4.6 warn for me. > > That's weird. I am having: > > gcc-3.3 (GCC) 3.3.6 (Ubuntu 1:3.3.6-15ubuntu6) > Configured with: ../src/configure -v --enable-languages=c,c++ > --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info > --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared > --enable-__cxa_atexit --with-system-zlib --enable-nls > --without-included-gettext --enable-clocale=gnu --enable-debug > i486-linux-gnu > Thread model: posix > gcc-3.4 (GCC) 3.4.6 (Ubuntu 3.4.6-8ubuntu2) > Configured with: ../src/configure -v --enable-languages=c,f77 > --prefix=/usr --libexecdir=/usr/lib > --with-gxx-include-dir=/usr/include/c++/3.4 --enable-shared > --with-system-zlib --enable-nls --without-included-gettext > --program-suffix=-3.4 --enable-__cxa_atexit --with-tune=pentium4 > i486-linux-gnu > Thread model: posix > > Are you sure you test the following variant? > > switch (rem) { > for(dst+=n;d case 0: d[-4]=s[-4]; ... > }} > > > > > Richard. > > > > >> > > >> Andrew. > > >> > > > > > > > > > > > > >
sizeof(array) with variable-length array parameter
Dear gcc users and developers, This might be a stupid question, nevertheless... I've been wondering for a long time, why the behaviour of variable-length arrays w.r.t. the sizeof operator is different for local/auto variables and for function arguments (in C99): #include void foo(int s, int a[s]) { printf("%u\n",sizeof a); } int main() { int s=10,a[s]; printf("%u\n",sizeof a); foo(sizeof a/sizeof a[0],a); } The printf's produce very different results: the first one returns "40" the other one returns 4, implying that the compiler forgets that the size of the array is actually "s", not the size of the pointer argument. Is it so difficult to make "sizeof a" return "s" in both cases? The info page on gcc-4.3 does refer to variable-length arrays in the context of the function declaration: "The length of an array is computed once when the storage is allocated and is remembered for the scope of the array in case you access it with `sizeof'." This behaviour is fortunately consistent with all versions of gcc that I actively use (2.95-4.3), so this might be a C99 feature that is not implemented correctly... Regards, Pjotr Kourzanov
Re: sizeof(array) with variable-length array parameter
On Wed, Apr 09, 2008 at 01:22:15PM +0100, Andrew Haley wrote: > [EMAIL PROTECTED] wrote: > > Dear gcc users and developers, > > > > This might be a stupid question, nevertheless... > > > > I've been wondering for a long time, why the behaviour of > > variable-length arrays w.r.t. the sizeof operator is different > > for local/auto variables and for function arguments (in C99): > > > > #include > > void foo(int s, int a[s]) { > > printf("%u\n",sizeof a); > > } > > int main() > > { > > int s=10,a[s]; > > printf("%u\n",sizeof a); > > foo(sizeof a/sizeof a[0],a); > > } > > > > The printf's produce very different results: the first one > > returns "40" the other one returns 4, implying that the compiler > > forgets that the size of the array is actually "s", not the size > > of the pointer argument. Is it so difficult to make "sizeof a" > > return "s" in both cases? > > That's C for you, I'm afraid: arrays always decay to pointers to the > first element when passed as arguments. The size of a VLA is not passed. Well, it sort of is: it is actually passed as a parameter to the function - look at the C99 defining a syntax for VLA parameters. The fact the the compiler forgets about it could be compensated by this... What I would like to know, is why the standard (ISO/IEC 9899) does explicitly require sizeof to return size of the pointer rather than more obvious response? Is it backwards compatibility? Is there any use in this behaviour besides making the life of the compiler development team easy? Surely, such an extension should be not too difficult to implement, provided that the compiler can account for (potential) modifications to the size parameter and require sizeof the return the value that was actually passed to the function, which BTW, it already does, since this behaviour is easily verified to work well for variable VLAs. In fact, strictly following C99 standard would force one that is not familiar with the use of const on integer parameters to use the "s" parameter to retrieve the size, which could lead to subtle bugs if the parameter is modified on the way: void foo(int s, int a[s]) { s=20; printf("%u\n",s); } Also, making sure that there are no out-of-bounds accesses (mudflap) is simply impossible if the array size is not known... Just curious... Pjotr > > Andrew. >