[Bug c/33102] New: volatile excessively suppresses optimizations in range checks
Source code:
--
volatile int i;
int j;

int testme(void)
{
        return i <= 1;
}

int testme2(void)
{
        return j <= 1;
}
--
Compiler command line: "cc -S -O torvalds.c"
--
Expected results: volatile accesses not moved past sequence points, optimization otherwise unaffected.
--
Observed results: redundant move to register generated, unnecessarily increasing register pressure, increasing the size of the binary, and potentially consuming more power and decreasing performance.
--
Build date and platform: August 17 2007, Ubuntu Feisty Fawn.
--
gcc -v output:
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu --enable-libstdcxx-debug --enable-mpfr --enable-checking=release i486-linux-gnu
Thread model: posix
gcc version 4.1.2 (Ubuntu 4.1.2-0ubuntu4)
--
Code generated to compare volatile variable to constant:

        movl    i, %eax
        cmpl    $1, %eax

Code generated to compare non-volatile variable to constant:

        cmpl    $1, j

The latter code sequence should be generated for the volatile case as well as the non-volatile case. Similar inefficiencies are produced in response to other uses of volatile variables.

--
           Summary: volatile excessively suppresses optimizations in range checks
           Product: gcc
           Version: 4.1.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: paulmck at linux dot vnet dot ibm dot com
 GCC build triplet: i486-linux-gnu
  GCC host triplet: i486-linux-gnu
GCC target triplet: i486-linux-gnu

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33102
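The "Expected results" line separates what volatile has to guarantee (each access is performed, in order) from what it need not prevent (choosing a compact instruction for each access). The following is a minimal sketch of the first half only, not part of the original report; the names `vi`, `pj`, `two_volatile_reads` and `two_plain_reads` are hypothetical and chosen purely for illustration.

```c
/* Hypothetical illustration, not taken from the bug report. */
volatile int vi;   /* every read written in the source must be performed */
int pj;            /* plain variable: reads may be merged by the compiler */

/* Two volatile reads separated by a sequence point: the compiler is
 * expected to emit two loads of vi, in this order. */
int two_volatile_reads(void)
{
        int a = vi;     /* first access */
        int b = vi;     /* second access, must not be folded into the first */
        return a + b;
}

/* Two plain reads: the compiler is free to load pj once and reuse it. */
int two_plain_reads(void)
{
        int a = pj;
        int b = pj;
        return a + b;
}
```

Both functions return the same value when nothing else modifies the variables; the difference is only in how many loads the compiler must keep, which can be checked by compiling with `-S -O` as in the report.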
[Bug c/33102] volatile excessively suppresses optimizations in range checks
--- Comment #2 from paulmck at linux dot vnet dot ibm dot com  2007-08-18 00:11 ---
Hmmm... I wasn't asking for volatile to be atomic, just for it to avoid generating unnecessary code.

--
paulmck at linux dot vnet dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|DUPLICATE                   |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33102
[Bug c/33102] volatile excessively suppresses optimizations in range checks
--- Comment #6 from paulmck at linux dot vnet dot ibm dot com  2007-08-18 01:04 ---
(In reply to comment #4)
> It is still the same issue.

Perhaps I am missing something, but I don't know of any hardware that would react differently to this two-instruction sequence:

        movl    i, %eax
        cmpl    $1, %eax

than it would to the following single instruction:

        cmpl    $1, j

Either way, there is a single memory reference, and for all hardware I know of, it looks the same to both memory and external devices.

--
paulmck at linux dot vnet dot ibm dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|DUPLICATE                   |

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33102
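The "single memory reference" point can also be stated at the C level. The sketch below is an illustration added here, not code from the thread; `testme_explicit` and `testme_direct` are hypothetical names. Both functions perform exactly one read of the volatile variable, so a compiler that emits the load-then-compare pair and one that emits a single compare-with-memory instruction would be generating the same number of accesses to `i`.

```c
/* Hypothetical illustration of comment #6, not taken from the thread. */
volatile int i;

/* Mirrors the two-instruction sequence: copy i into a temporary
 * (one volatile read), then compare the copy. */
int testme_explicit(void)
{
        int tmp = i;    /* the only access to i */
        return tmp <= 1;
}

/* Mirrors the requested single instruction: compare i directly.
 * The expression still reads i exactly once. */
int testme_direct(void)
{
        return i <= 1;  /* the only access to i */
}
```

Which instruction sequence a given compiler version actually emits for each function is an assumption to verify with `-S`, not something this sketch guarantees.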
[Bug middle-end/33102] volatile excessively suppresses optimizations in range checks
--- Comment #11 from paulmck at linux dot vnet dot ibm dot com  2007-08-18 01:21 ---
(In reply to comment #10)
> Actually as I understand it, the expanded version is slightly faster under
> newer x86's anyways as they don't have an extra decode stage.

The main concern on the recent LKML thread appeared to be code size rather than speed. That said, if you are correct, it certainly wouldn't be the first time that multiple instructions turned out to be cheaper than a single instruction.

--

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33102
[Bug middle-end/33102] volatile excessively suppresses optimizations in range checks
--- Comment #12 from paulmck at linux dot vnet dot ibm dot com  2007-08-18 01:23 ---
(In reply to comment #9)
> s/debian/Ubuntu/

Please accept my apologies for skipping that step -- I wasn't aware of this. Should I replicate this bug at Ubuntu, or is this strictly advice for future bug submissions?

--

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33102
[Bug middle-end/33102] volatile excessively suppresses optimizations in range checks
--- Comment #15 from paulmck at linux dot vnet dot ibm dot com  2007-08-18 22:12 ---
(In reply to comment #13)
> (In reply to comment #11)
> > The main concern on the recent LKML thread appeared to be code size rather
> > than speed.
> One should note this only helps CISC based processors, it will not help stuff
> like PowerPC anyways. It is better to remove volatile in 95% of the places
> where the kernel uses it anyways than fix this bug.

I agree that this change won't help PowerPC. As you say, it is primarily helpful to CISC processors (x86, x86-64, mainframe, m68000, ...). Although there do appear to be places in the kernel where volatile is overused and abused, it would still be good to fix this bug.

> (In reply to comment #12)
> > Please accept my apologies for skipping that step -- I wasn't aware of
> > this. Should I replicate this bug at Ubuntu, or is this strictly advice
> > for future bug submissions?
>
> It would be better next time unless you can test it on a FSF GCC source
> release/SVN.

Thank you for the guidance!

--

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33102
[Bug middle-end/33102] volatile excessively suppresses optimizations in range checks
--- Comment #14 from paulmck at linux dot vnet dot ibm dot com  2007-08-18 22:08 ---
(In reply to comment #7)
> One should note this is actually hard to do without changing the code for 3506
> also.

And of course if the volatile variable in the 3506 example code was an MMIO register, there would not be any atomicity, at least not given the hardware I have come across. And I am not aware of any devices where it would be useful to blindly increment an MMIO register. So I believe that this is a non-issue. Or am I missing something?

--

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33102
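The atomicity remark can be made concrete with a small sketch. It assumes, as the comment implies but this thread does not show, that the 3506 example is an increment of a volatile variable; the names `counter`, `bump` and `bump_expanded` are hypothetical. In C, incrementing a volatile object is a read followed by a write, so the two functions below request the same pair of accesses, and neither is an atomic operation as far as the language is concerned.

```c
/* Hypothetical illustration; the actual 3506 test case is not quoted here. */
volatile int counter;

/* Increment written as a single expression: one volatile read of
 * counter followed by one volatile write. */
void bump(void)
{
        counter++;
}

/* The same accesses spelled out as separate steps. */
void bump_expanded(void)
{
        int tmp = counter;      /* volatile read */
        tmp = tmp + 1;          /* arithmetic on the private copy */
        counter = tmp;          /* volatile write */
}
```

Whether a compiler folds `bump` into a single read-modify-write instruction is a code generation choice; code that needs an atomic update has to use a mechanism designed for that rather than rely on volatile.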