[Bug c/33102] New: volatile excessively suppresses optimizations in range checks

2007-08-17 Thread paulmck at linux dot vnet dot ibm dot com
Source code:
--
volatile int i;
int j;

int testme(void)
{
    return i <= 1;
}

int testme2(void)
{
    return j <= 1;
}
--
Compiler command line: "cc -S -O torvalds.c"
--
Expected results: volatile accesses not moved past sequence points,
optimization otherwise unaffected.
--
Observed results: redundant move to register generated, unnecessarily increasing
register pressure, increasing the size of the binary, and potentially consuming
more power and decreasing performance.
--
Build date and platform: August 17 2007 Ubuntu Feisty Fawn.
--
gcc -v output:
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v
--enable-languages=c,c++,fortran,objc,obj-c++,treelang --prefix=/usr
--enable-shared --with-system-zlib --libexecdir=/usr/lib
--without-included-gettext --enable-threads=posix --enable-nls
--program-suffix=-4.1 --enable-__cxa_atexit --enable-clocale=gnu
--enable-libstdcxx-debug --enable-mpfr --enable-checking=release i486-linux-gnu
Thread model: posix
gcc version 4.1.2 (Ubuntu 4.1.2-0ubuntu4)
--
Code generated to compare volatile variable to constant:
    movl    i, %eax
    cmpl    $1, %eax
Code generated to compare non-volatile variable to constant:
    cmpl    $1, j
The latter code sequence should be generated for the volatile case as well as
the non-volatile case.  Similar inefficiencies are produced in response to
other uses of volatile variables.


-- 
           Summary: volatile excessively suppresses optimizations in range checks
           Product: gcc
           Version: 4.1.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: paulmck at linux dot vnet dot ibm dot com
 GCC build triplet: i486-linux-gnu
  GCC host triplet: i486-linux-gnu
GCC target triplet: i486-linux-gnu


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33102



[Bug c/33102] volatile excessively suppresses optimizations in range checks

2007-08-17 Thread paulmck at linux dot vnet dot ibm dot com


--- Comment #2 from paulmck at linux dot vnet dot ibm dot com  2007-08-18 00:11 ---
Hmmm...  I wasn't asking for volatile to be atomic, just for it to avoid
generating unnecessary code.


-- 

paulmck at linux dot vnet dot ibm dot com changed:

           What    |Removed     |Added
         ----------+------------+------------
         Status    |RESOLVED    |UNCONFIRMED
         Resolution|DUPLICATE   |





[Bug c/33102] volatile excessively suppresses optimizations in range checks

2007-08-17 Thread paulmck at linux dot vnet dot ibm dot com


--- Comment #6 from paulmck at linux dot vnet dot ibm dot com  2007-08-18 01:04 ---
(In reply to comment #4)
> It is still the same issue.

Perhaps I am missing something, but I don't know of any hardware that would
react differently to this two-instruction sequence:

    movl    i, %eax
    cmpl    $1, %eax

than it would to the following single instruction:

    cmpl    $1, j

Either way, there is a single memory reference, and for all hardware I know of,
it looks the same to both memory and external devices.
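
The same point can be made at the source level. The following is only a minimal
sketch: testme_via_temp() and check_forms_agree() are hypothetical names that do
not appear in the original test case. Both comparison functions read the
volatile variable exactly once, so a compiler that folds the load into the
compare should not change the number of accesses to i.

```c
#include <assert.h>

volatile int i;

/* Direct comparison: the expression reads i exactly once. */
int testme(void)
{
    return i <= 1;
}

/* Hypothetical variant: the read is spelled out as a separate step.
 * There is still exactly one read of i, followed by a compare on the copy. */
int testme_via_temp(void)
{
    int tmp = i;    /* the single volatile access */

    return tmp <= 1;
}

/* Hypothetical helper: both forms must give the same answer for a value. */
void check_forms_agree(int value)
{
    i = value;
    assert(testme() == testme_via_temp());
}
```

Compiling this with "cc -S -O" and comparing the output for the two functions
is one way to see whether a given compiler treats them differently.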


-- 

paulmck at linux dot vnet dot ibm dot com changed:

           What    |Removed     |Added
         ----------+------------+------------
         Status    |RESOLVED    |UNCONFIRMED
         Resolution|DUPLICATE   |





[Bug middle-end/33102] volatile excessively suppresses optimizations in range checks

2007-08-17 Thread paulmck at linux dot vnet dot ibm dot com


--- Comment #11 from paulmck at linux dot vnet dot ibm dot com  2007-08-18 01:21 ---
(In reply to comment #10)
> Actually as I understand it, the expanded version is slightly faster under
> newer x86's anyways as they don't have an extra decode stage.

The main concern on the recent LKML thread appeared to be code size rather than
speed.

That said, if you are correct, it certainly wouldn't be the first time that
multiple instructions turned out to be cheaper than a single instruction.





[Bug middle-end/33102] volatile excessively suppresses optimizations in range checks

2007-08-17 Thread paulmck at linux dot vnet dot ibm dot com


--- Comment #12 from paulmck at linux dot vnet dot ibm dot com  2007-08-18 01:23 ---
(In reply to comment #9)
> s/debian/Ubuntu/

Please accept my apologies for skipping that step -- I wasn't aware of this. 
Should I replicate this bug at Ubuntu, or is this strictly advice for future
bug submissions?





[Bug middle-end/33102] volatile excessively suppresses optimizations in range checks

2007-08-18 Thread paulmck at linux dot vnet dot ibm dot com


--- Comment #15 from paulmck at linux dot vnet dot ibm dot com  2007-08-18 22:12 ---
(In reply to comment #13)
> (In reply to comment #11)
> > The main concern on the recent LKML thread appeared to be code size rather
> > than speed.
> One should note this only helps CISC based processors, it will not help stuff
> like PowerPC anyways.  It is better to remove volatile in 95% of the places
> where the kernel uses it anyways than fix this bug.

I agree that this change won't help PowerPC.  As you say, it is primarily
helpful to CISC processors (x86, x86-64, mainframe, m68000, ...).  Although
there do appear to be places in the kernel where volatile is overused and
abused, it would still be good to fix this bug.
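
As an illustration of what reducing volatile usage could look like, here is a
minimal sketch in which the variable is declared without volatile and the
qualifier is applied only at the one access that needs it. All names here
(shared_flag, read_int_once, flag_is_small, check_flag) are hypothetical and
not taken from the kernel or from this report. The sketch assumes, as GCC does
in practice, that a read through a volatile-qualified lvalue is treated as a
volatile access.

```c
#include <assert.h>

int shared_flag;    /* plain declaration, no volatile */

/* Hypothetical helper: force a single real read of *p at this call site
 * by going through a volatile-qualified lvalue. */
static inline int read_int_once(const int *p)
{
    return *(const volatile int *)p;
}

/* Only this access is volatile; other uses of shared_flag stay optimizable. */
int flag_is_small(void)
{
    return read_int_once(&shared_flag) <= 1;
}

/* Hypothetical self-check for the sketch above. */
void check_flag(void)
{
    shared_flag = 1;
    assert(flag_is_small());
    shared_flag = 5;
    assert(!flag_is_small());
}
```

Even with this approach, the comparison in flag_is_small() goes through a
volatile access, so the code-generation issue reported here would still apply
to it.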

> (In reply to comment #12)
> > Please accept my apologies for skipping that step -- I wasn't aware of this.
> > Should I replicate this bug at Ubuntu, or is this strictly advice for future
> > bug submissions?
> 
> It would be better next time unless you can test it on a FSF GCC source
> release/SVN.

Thank you for the guidance!





[Bug middle-end/33102] volatile excessively suppresses optimizations in range checks

2007-08-18 Thread paulmck at linux dot vnet dot ibm dot com


--- Comment #14 from paulmck at linux dot vnet dot ibm dot com  2007-08-18 22:08 ---
(In reply to comment #7)
> One should note this is actually hard to do without changing the code for 3506
> also.

And of course if the volatile variable in the 3506 example code were an MMIO
register, there would not be any atomicity, at least not given the hardware I
have come across.  And I am not aware of any devices where it would be useful
to blindly increment an MMIO register.
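
For concreteness, here is a minimal sketch of the kind of increment under
discussion, assuming the 3506 example is an increment of a volatile variable,
as the comment above suggests. The names counter, bump, bump_spelled_out and
check_bumps are hypothetical. In C, incrementing a volatile object is a read
followed by a write, and the language makes no atomicity promise about the
pair, whichever instruction sequence the compiler picks.

```c
#include <assert.h>

volatile int counter;

/* Increment of a volatile: one read and one write of counter. */
void bump(void)
{
    counter++;
}

/* The same operation with the two accesses written out separately. */
void bump_spelled_out(void)
{
    int tmp = counter;    /* volatile read */

    counter = tmp + 1;    /* volatile write */
}

/* Hypothetical single-threaded check: both forms add exactly one. */
void check_bumps(void)
{
    counter = 0;
    bump();
    bump_spelled_out();
    assert(counter == 2);
}
```

In a single thread the two functions are interchangeable; neither one is safe
against a concurrent update between the read and the write.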

So I believe that this is a non-issue.  Or am I missing something?

