Re: Volatile qualification on pointer and data
What can't make sense is a /static/ "volatile const" which is /defined/ locally, rather than just declared. The code in question sounds well-defined (but probably poor style) to me. It is never OK to access a qualified object through an unqualified pointer, but my understanding is that accessing an unqualified object through a qualified pointer is well-defined and that the usual qualifier rules apply to that access. David, is your "can't make sense" backed up by a standard? There is no "lying to the compiler", there is only conforming and non-conforming code. John
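A minimal sketch of the case under discussion, assuming an object defined as const (not volatile) and then read through a pointer that adds the volatile qualifier. The names table_entry and read_entry_as_volatile are hypothetical and appear nowhere in the thread. Adding a qualifier through a pointer conversion is permitted. Whether the compiler must actually emit the load is the point being argued here.

```c
/* Hypothetical names, for illustration only. */
static const int table_entry = 42;   /* defined const, not volatile */

/* Read the unqualified-as-volatile object through a pointer that adds
   the volatile qualifier. The conversion from "const int *" to
   "const volatile int *" only adds a qualifier, so it is implicit. */
int read_entry_as_volatile(void)
{
    const volatile int *p = &table_entry;
    return *p;   /* the access is made through a volatile-qualified lvalue */
}
```

Compiling this with -O2 -S under different compiler versions and looking for the load in read_entry_as_volatile is one way to see the behavior difference described in the thread.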
Re: Volatile qualification on pointer and data
it. And while I think the compiler should be allowed to generate the optimised code of 4.6 (i.e., the change is not a bug IMHO), I fully understand the idea of generating the older, slower, but definitely correct code of 4.5. My understanding is that the standard mandates the old behavior, so 4.6 is in error. I am still trying to imagine a real-world use-case for declaring an object "static const" and later accessing it as "volatile". Yeah, it would seem far clearer to declare it as static const volatile in the first place. I've done a bunch of automated testing of GCC's implementation of volatile. None of that testing would have exposed this bug because we only count accesses to objects declared as volatile. I've come to the conclusion that "volatile" is a language design error. It complicates the compiler implementation and has confusing, underspecified semantics. If you want to force a load or store, an explicit function call is a clearer way to do it. John
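One way to picture the "explicit function call" idea is a pair of accessor helpers, so that every forced load or store is visible at the call site. This is only a sketch: the names load32 and store32 are hypothetical, and in standard C the helpers themselves still have to be written with a volatile-qualified access, so this changes how the intent reads rather than removing volatile from the language.

```c
#include <stdint.h>

/* Hypothetical helper: force a 32-bit load from addr. */
static inline uint32_t load32(const volatile uint32_t *addr)
{
    return *addr;
}

/* Hypothetical helper: force a 32-bit store of value to addr. */
static inline void store32(volatile uint32_t *addr, uint32_t value)
{
    *addr = value;
}
```

With helpers like these the data itself can be declared without the qualifier, and only the accesses that need to reach memory are spelled out.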
detailed comparison of generated code size for GCC and other compilers
See here: http://embed.cs.utah.edu/embarrassing/ There is a lot of data there. Please excuse bugs and other problems. Feedback would be appreciated. John Regehr
Re: detailed comparison of generated code size for GCC and other compilers
I wonder if the original program was already broken or was this something your conversion introduced? Not sure about this specific case but I'm sure there's some of each. I also noticed these testcases but decided to leave them in for now. Obviously the code is useless, but it can still be interpreted according to the C standard, and code can be generated. Once you start going down the road of exploiting undefined behavior to create better code -- and gcc already does this pretty aggressively -- why not keep going? That said, if there's a clear sentiment that this kind of test case is undesirable, I'll make an effort to get rid of these for subsequent runs. The bottom line is that these results are supposed to provide you folks with useful directions for improvement. Looking further down the table a lot of the differences on empty-after-optimization functions (lots of 5 vs 2 bytes) seem to be that gcc-head uses frame pointers and the other compiler doesn't. Clearly for a fair comparison these settings should be the same. I wanted to avoid playing flag games and go with -Os (or nearest equivalent) for all compilers. Maybe that isn't right. John Regehr
Re: detailed comparison of generated code size for GCC and other compilers
Ok, thanks for the feedback Andi. Incidentally, the LLVM folks seem to agree with both of your suggestions. I'll re-run everything w/o frame pointers and ignoring testcases where some compiler warns about use of uninitialized local. I hate the way these warnings are not totally reliable, but realistically if GCC catches most cases (which it almost certainly will) the ones that slip past won't be too much of a problem. No doubt there are plenty more improvements to make but hopefully this is a good start. John Regehr
Re: detailed comparison of generated code size for GCC and other compilers
My opinion is that code containing undefined behaviors is definitely interesting, but probably it is interesting in a different way than functions that are more meaningful. If I have time I'll just separate out the testcases into two groups: one containing functions that are more or less sensible code, the other containing functions that can be automatically categorized as bogus. Thanks, John Regehr Actually, I think they're very interesting - especially if they are valid code, and one compiler optimizes them away, but the other doesn't. You may have heard of a commercial testsuite built on this principle :-) -- Daniel Jacobowitz CodeSourcery
Re: detailed comparison of generated code size for GCC and other compilers
Optimizations based on uninitialized variables make me very nervous. If uninitialized memory reads are transformed into don't-cares, then checking tools like valgrind will no longer see the UMR (assuming that the lack of initialization is a bug). Did I understand that icc does this? It seems like a dangerous practice. Yes, it looks like icc does this. But so does gcc, see below. There is no "add" in the generated code. John Regehr

[reg...@babel ~]$ cat undef.c
int foo (int x)
{
  int y;
  return x+y;
}
[reg...@babel ~]$ current-gcc -O3 -S -o - undef.c -fomit-frame-pointer
        .file   "undef.c"
        .text
        .p2align 4,,15
.globl foo
        .type   foo, @function
foo:
        movl    4(%esp), %eax
        ret
        .size   foo, .-foo
        .ident  "GCC: (GNU) 4.5.0 20091117 (experimental)"
        .section        .note.GNU-stack,"",@progbits
Re: detailed comparison of generated code size for GCC and other compilers
I would only be worried for cases where no warning is issued *and* uninitialized accesses are eliminated. Yeah, it would be excellent if GCC maintained the invariant that for all uses of uninitialized storage, either the compiler or else valgrind will issue a warning. We could test for violations of this. Several times I've thought about cross-testing various compilers and versions of compilers for consistency of warnings. But I never managed to convince myself that developers would care enough to make it worth the trouble. John
Re: detailed comparison of generated code size for GCC and other compilers
Also, we're not running LTO in any compiler and we removed all "static" declarations from the code to keep compilers from making closed-world assumptions. John Regehr
updated code size comparison
[cross-posting to the GCC and LLVM lists]

I've updated the code size results here: http://embed.cs.utah.edu/embarrassing/dec_09/

The changes for this run were:
- delete a number of testcases that contained use of uninitialized local variables
- turn off frame pointer emission for all compilers
- ask all compilers to target x86 + SSE3
- ask all compilers to not emit stack protector code
- run unix2dos on the .c files so people on Windows don't see all the lines running together

Hopefully the results are more fair and useful now. Again, feedback is appreciated. Once people are happy with how these results are obtained, I'll plan on just re-running the scripts every few months so we can see how the compilers evolve. Also there are many possibilities for enhancement including adding new architectures, harvesting more and larger functions, and harvesting C++ code. Thanks, John Regehr
Re: updated code size comparison
Moreover, aggregating those boolean results to yield things like "X generated larger code than Y NN% of the time" seems even weirder. Is this really useful information, other than for marketing? Hi Miles- Did you click through to one of the pages that shows a rank-ordered list of functions where one compiler generates bigger code than another? Those are the pages that are supposed to contain useful information. You're right, the aggregated results are not useful other than to get a broad overview. John Regehr
Re: updated code size comparison
Hi Paolo, I would also avoid testcases using volatile. Smaller code on these testcases is often a sign of miscompilation rather than optimization. For example, http://embed.cs.utah.edu/embarrassing/src_harvested_dec_09/076389.c is miscompiled on GCC 3.4 and SunCC 5.10. Yeah, there are definitely several examples where small code is generated by miscompilation, especially of volatiles. However I would prefer to leave these testcases in, unless there is a strong feeling that they are too distracting. They serve as poignant little reminders about how easy it is to get volatile wrong... John
Re: updated code size comparison
Yes, that was my point. If you want to make a separate section for volatile, that would indeed be helpful. I checked and there are about 37,000 harvested functions containing the volatile qualifier. Next time, there will be even more since we'll be harvesting code from the FreeBSD kernel in addition to Linux. It doesn't seem at all clear that it's productive to separate these out. If people are really hating volatile and think it leads to unfair results, I'll probably just #define away volatile next time. John
updated code size comparison
Hi folks, I've posted an updated code size comparison between LLVM, GCC, and others here: http://embed.cs.utah.edu/embarrassing/

New in this version:
- much larger collection of harvested functions: more than 360,000
- bug fixes and UI improvements
- added the x86 Open64 compiler

John
expression statements, volatiles, and C vs. C++
The question is, what should C and C++ compilers do with this code?

    volatile int x;
    void foo (void)
    {
      x;
    }

This question is not totally stupid: embedded systems use code like this when reading a hardware register has a useful side effect (usually clearing the register). It is reasonably clear that a C compiler should load from x and throw away the value. gcc does this, as do most decent C compilers. However, g++ also loads from x and this does not appear to be supported by the 1998 C++ standard. In 6.2, it is explicitly stated that for an expression statement, no conversion from lvalue to rvalue occurs. If there's no rvalue, there should not be a load from x. Anyway, I'm curious: is the C-like interpretation of a volatile expression statement considered to be a feature by the g++ maintainers? If so, what is the rationale? I haven't done extensive testing, but there do exist compiler families (such as those from IAR and Intel) where the C compiler loads from x and the C++ compiler does not. Thanks, John Regehr
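For code that has to behave the same way under both C and C++ compilers, one commonly used workaround is to make the read part of an initialization, so that the value is needed and the lvalue-to-rvalue question does not arise. The sketch below assumes x stands in for a hardware register. The function name read_and_discard and the temporary tmp are hypothetical.

```c
volatile int x;   /* stands in for a hardware register (assumption) */

/* Read x for its side effect and throw the value away.
   Initializing tmp requires the value of x, so a load is required
   whether the file is compiled as C or as C++. */
void read_and_discard(void)
{
    int tmp = x;
    (void)tmp;    /* silence "unused variable" warnings */
}
```

This sidesteps the expression-statement case entirely. It does not settle what a compiler should do with the bare "x;" statement.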
Re: expression statements, volatiles, and C vs. C++
I'm not sure this follows. It's stated explicitly that "The expression is evaluated and its value is discarded." How can you evaluate the expression without reading the volatile? I'm certainly not an expert on this material but I wouldn't think you'd normally read a variable in order to evaluate it as an lvalue. My guess is that evaluating the expression means computing the address for something like "*(p+1)". Also, there's the non-normative [Note: volatile is a hint to the implementation to avoid aggressive optimization involving the object because the value of the object might be changed by means undetectable by an implementation. See 1.9 for detailed semantics. In general, the semantics of volatile are intended to be the same in C++ as they are in C. ] Yep. Besides, all that changing this would do is break programs. Well they're broken anyway since widely-used C++ compilers are choosing to not read the variable under these circumstances. The obvious solution is "don't do that" but there's plenty of code doing this out there, and plenty of developers who need re-educating. John
Re: register int variable being written to/read from stack
First off, the volatile qualifier means "source-level reads/writes must turn into real loads and stores" so take that out of your declaration and see how the code looks then. I don't know if the standard addresses the interaction of volatile and register, but I'd expect volatile to win since register doesn't actually mean anything (in the standard, at least). John Regehr On Tue, 20 Jan 2009, Ian Lance Taylor wrote: baver writes: A sample code listing is at the bottom of the email, as well as the lines we've added to opcodes/mips-opc.c for our opcodes. Anyone know how to stop the register from being stored and read from on the stack? We've defined it as volatile register int idx asm("s0"); Show us the change you made to the MD file, or, if you didn't make a change, show us the asm statement you are using. Anyone know how to stop the register from being stored and read from on the stack? We've defined it as volatile register int idx asm("s0"); Are you doing this inside or outside of a function? That is, a global variable or not? The two cases act quite differently, as discussed briefly in the documentation. Ian
Re: Split Stacks proposal
This effort is relevant: http://research.microsoft.com/en-us/um/people/jcondit/capriccio-sosp-2003.pdf John Regehr
Re: bitwise dataflow
I'm thinking about adding bitwise dataflow analysis support to RTL. Before embarking on this, I'd suggest playing with the bitwise domain analysis that one of my students did as part of his cXprop tool: http://www.cs.utah.edu/~coop/research/cxprop/#DOWNLOADS This is a source-level analysis in CIL and so is not quite analogous to what you propose. However, it should give you some ideas about what kind of results you can expect at the RTL level. Our experience was that the bitwise domain is not that powerful. On the other hand, it converges quickly compared to intervals. John Regehr
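To make "bitwise domain" concrete, here is a small sketch of one common formulation: each abstract value records which bits are known to be 0 and which are known to be 1, and every other bit is unknown. This is not taken from cXprop or from any RTL pass. The type bits_t and all function names are hypothetical, and only AND, OR, and the control-flow join are shown.

```c
#include <stdint.h>

/* Abstract value: a bit set in zeros is known to be 0, a bit set in
   ones is known to be 1, a bit set in neither is unknown. */
typedef struct {
    uint32_t zeros;
    uint32_t ones;
} bits_t;

/* A constant has every bit known. */
static inline bits_t bits_const(uint32_t c)
{
    bits_t r = { (uint32_t)~c, c };
    return r;
}

/* Top of the lattice: nothing is known. */
static inline bits_t bits_top(void)
{
    bits_t r = { 0u, 0u };
    return r;
}

/* AND: a result bit is 1 only if both inputs are known 1,
   and 0 if either input is known 0. */
static inline bits_t bits_and(bits_t a, bits_t b)
{
    bits_t r = { a.zeros | b.zeros, a.ones & b.ones };
    return r;
}

/* OR: a result bit is 1 if either input is known 1,
   and 0 only if both inputs are known 0. */
static inline bits_t bits_or(bits_t a, bits_t b)
{
    bits_t r = { a.zeros & b.zeros, a.ones | b.ones };
    return r;
}

/* Join at a control-flow merge: keep only facts true on both paths. */
static inline bits_t bits_join(bits_t a, bits_t b)
{
    bits_t r = { a.zeros & b.zeros, a.ones & b.ones };
    return r;
}
```

Because each bit can only move from known to unknown at a join, a fixpoint iteration over this domain needs few passes, which fits the remark that it converges quickly compared to intervals.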
some integer undefined behaviors in gcc
I ran gcc 162830 on x86 under a tool that checks for integer undefined behaviors. The attached error messages show up when running "make check" and when recompiling gcc. Each line in the attachment is an error message giving the problematic operator, its srcloc, the types of its operands, and examples of offending values. Let me know if more detail is needed or if it would be better for me to file all 71 bug reports. Thanks, John Regehr

<../../gcc/alias.c, (1896:25)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): -2147483647 right (int32): -4
<../../gcc/alias.c, (322:44)> : Op: *, Reason : Signed Multiplication Overflow, BINARY OPERATION: left (int32): 39996 right (int32): 8
<../../gcc/builtins.c, (7681:57)> : Op: -, Reason : Signed Subtraction Overflow, UNARY OPERATION: right (int32): -2147483648
<../../gcc/builtins.c, (7699:57)> : Op: -, Reason : Signed Subtraction Overflow, UNARY OPERATION: right (int32): -2147483648
<../../gcc/builtins.c, (7709:25)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/builtins.c, (7717:25)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/combine.c, (10620:62)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/combine.c, (10655:62)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/combine.c, (11350:54)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/combine.c, (7047:63)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/combine.c, (7205:54)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): 2147483647 right (int32): 1
<../../gcc/combine.c, (7838:22)> : Op: <<, Reason : Unsigned Left Shift Error: Right operand is negative or is greater than or equal to the width of the promoted left operand, BINARY OPERATION: left (uint32): 4294967295 right (uint32): 4294967291
<../../gcc/config/i386/i386.c, (10253:10)> : Reason : The current index is greater than array size!
<../../gcc/config/i386/i386.c, (16316:17)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): 0 right (int32): -2147483648
<../../gcc/config/i386/i386.c, (16362:18)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): 0 right (int32): -2147483648
<../../gcc/config/i386/i386.c, (16473:11)> : Op: -, Reason : Signed Subtraction Overflow, UNARY OPERATION: right (int32): -2147483648
<../../gcc/dbxout.c, (674:14)> : Op: -, Reason : Signed Subtraction Overflow, UNARY OPERATION: right (int32): -2147483648
<../../gcc/double-int.c, (115:13)> : Op: -, Reason : Signed Subtraction Overflow, UNARY OPERATION: right (int32): -2147483648
<../../gcc/double-int.c, (158:21)> : Op: *, Reason : Signed Multiplication Overflow, BINARY OPERATION: left (int32): 65535 right (int32): 65535
<../../gcc/dse.c, (1636:28)> : Op: +, Reason : Signed Addition Overflow, BINARY OPERATION: left (int32): 2147483647 right (int32): 1
<../../gcc/dwarf2out.c, (4753:46)> : Op: -, Reason : Signed Subtraction Overflow, UNARY OPERATION: right (int32): -2147483648
<../../gcc/emit-rtl.c, (261:44)> : Op: *, Reason : Signed Multiplication Overflow, BINARY OPERATION: left (int32): 134217728 right (int32): 5
<../../gcc/emit-rtl.c, (262:40)> : Op: *, Reason : Signed Multiplication Overflow, BINARY OPERATION: left (int32): 786432 right (int32): 250
<../../gcc/expmed.c, (1092:13)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/expmed.c, (2928:15)> : Op: -=, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/expmed.c, (3107:8)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/expmed.c, (3707:52)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/expmed.c, (3813:23)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/expmed.c, (4151:12)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/expmed.c, (949:38)> : Op: -, Reason : Signed Subtraction Overflow, BINARY OPERATION: left (int32): -2147483648 right (int32): 1
<../../gcc/expmed.c, (954:42)> : Op: -, Reason : Signed Subtractio
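For readers unfamiliar with what the tool is flagging, the sketch below shows one way to test for signed addition and subtraction overflow before performing the operation, so that the undefined operation is never executed. It is not the checker used for the report and not code from gcc. The names checked_add and checked_sub are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Store a + b in *out and return true if the sum is representable
   as an int; otherwise return false and leave *out untouched. */
bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Store a - b in *out and return true if the difference is
   representable as an int; otherwise return false. */
bool checked_sub(int a, int b, int *out)
{
    if ((b > 0 && a < INT_MIN + b) || (b < 0 && a > INT_MAX + b))
        return false;
    *out = a - b;
    return true;
}
```

The operand pairs that recur in the messages, such as 2147483647 + 1, -2147483648 - 1, and 0 - (-2147483648), are all rejected by these checks on a platform with 32-bit int.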
Re: some integer undefined behaviors in gcc
I think the messages are clear enough. You should probably wait a few days to let people comment and/or fix, and then file PRs. 1 per file seems to be the right granularity. Thanks Eric, that's what I'll do. John
Re: some integer undefined behaviors in gcc
On Sat, 7 Aug 2010, Florian Weimer wrote: I wonder if we should give up and make -fwrapv the default. My sense is that there are not that many of these integer bugs, and probably all of them are simple to fix. Best to just fix them and then run a tool like ours every now and then to see if anything new has popped up. John Regehr
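As an illustration of what a simple fix can look like, the unary negation of the most negative int (the "UNARY OPERATION: right (int32): -2147483648" pattern) can often be avoided by doing the arithmetic in an unsigned type, where wraparound is defined. This is a sketch of the general technique, not a patch for any of the reported gcc locations. The name abs_as_unsigned is hypothetical.

```c
#include <limits.h>

/* Return the magnitude of v as an unsigned int. Negating in unsigned
   arithmetic is well-defined even when v == INT_MIN, where "-v"
   evaluated as an int would overflow. */
unsigned int abs_as_unsigned(int v)
{
    unsigned int u = (unsigned int)v;
    return v < 0 ? 0u - u : u;
}
```

On the usual two's complement targets, abs_as_unsigned(INT_MIN) yields (unsigned int)INT_MAX + 1u.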
announcing C-Reduce, a test-case reducer for C/C++ programs
http://blog.regehr.org/archives/697 John