Re: Performance degradation on g++ 4.6

2011-08-24 Thread Oleg Smolsky
Sure. I've just attached it to the bug. On 2011/8/24 14:56, Xinliang David Li wrote: Thanks. Can you make the test case a standalone preprocessed file (using -E)? David On Wed, Aug 24, 2011 at 2:26 PM, Oleg Smolsky wrote: On 2011/8/24 13:02, Xinliang David Li wrote: On 2011/8/23 11:38, Xin

Re: Performance degradation on g++ 4.6

2011-08-24 Thread Xinliang David Li
Thanks. Can you make the test case a standalone preprocessed file (using -E)? David On Wed, Aug 24, 2011 at 2:26 PM, Oleg Smolsky wrote: > On 2011/8/24 13:02, Xinliang David Li wrote: >>> >>> On 2011/8/23 11:38, Xinliang David Li wrote: Partial register stall happens when there is a 3

Re: Performance degradation on g++ 4.6

2011-08-24 Thread Oleg Smolsky
On 2011/8/24 13:02, Xinliang David Li wrote: On 2011/8/23 11:38, Xinliang David Li wrote: Partial register stall happens when there is a 32bit register read followed by a partial register write. In your case, the stall probably happens in the next iteration when 'add eax, 0Ah' executes, so your

Re: Performance degradation on g++ 4.6

2011-08-24 Thread Xinliang David Li
On Wed, Aug 24, 2011 at 12:50 PM, Oleg Smolsky wrote: > On 2011/8/23 11:38, Xinliang David Li wrote: >> >> Partial register stall happens when there is a 32bit register read >> followed by a partial register write. In your case, the stall probably >> happens in the next iteration when 'add eax, 0A

Re: Performance degradation on g++ 4.6

2011-08-24 Thread Oleg Smolsky
On 2011/8/23 11:38, Xinliang David Li wrote: Partial register stall happens when there is a 32bit register read followed by a partial register write. In your case, the stall probably happens in the next iteration when 'add eax, 0Ah' executes, so your manual patch does not work. Try changing add a

Re: Performance degradation on g++ 4.6

2011-08-23 Thread Xinliang David Li
Partial register stall happens when there is a 32bit register read followed by a partial register write. In your case, the stall probably happens in the next iteration when 'add eax, 0Ah' executes, so your manual patch does not work. Try changing add al, [dx] into two instructions (assuming esi is

Re: Performance degradation on g++ 4.6

2011-08-23 Thread Oleg Smolsky
Hey Andrew, On 2011/8/22 18:37, Andrew Pinski wrote: On Mon, Aug 22, 2011 at 6:34 PM, Oleg Smolsky wrote: On 2011/8/22 18:09, Oleg Smolsky wrote: Both compilers fully inline the templated function and the emitted code looks very similar. I am puzzled as to why one of these loops is significan

Re: Performance degradation on g++ 4.6

2011-08-22 Thread Andrew Pinski
On Mon, Aug 22, 2011 at 6:34 PM, Oleg Smolsky wrote: > On 2011/8/22 18:09, Oleg Smolsky wrote: >> >> Both compilers fully inline the templated function and the emitted code >> looks very similar. I am puzzled as to why one of these loops is >> significantly slower than the other. I've attached dis

Re: Performance degradation on g++ 4.6

2011-08-22 Thread Oleg Smolsky
On 2011/8/22 18:09, Oleg Smolsky wrote: Both compilers fully inline the templated function and the emitted code looks very similar. I am puzzled as to why one of these loops is significantly slower than the other. I've attached disassembled listings - perhaps someone could have a look please? (

Re: Performance degradation on g++ 4.6

2011-08-22 Thread Oleg Smolsky
Hey David, these two --param options made no difference to the test. I've cut the suite down to a single test (attached), which yields the following results:

./simple_types_constant_folding_os (gcc 41)
test   description             time       operations/s
0      "int8_t constant add"   1.34 sec
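
The attached test itself is not reproduced in this listing. As a rough idea of what an isolated "int8_t constant add" case could look like, here is a minimal sketch; the names sum_with_constant and time_constant_add are hypothetical and are not taken from the benchmark suite, and the timing uses plain std::clock() rather than whatever the suite uses.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <stdint.h>
#include <vector>

// Hypothetical stand-in for the suite's templated test: adds a
// compile-time constant to every element and accumulates the result.
template <typename T, T Constant>
T sum_with_constant(const std::vector<T>& data) {
    T result = 0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        // The arithmetic is done in int; the cast narrows back to T.
        // Out-of-range narrowing is implementation-defined before C++20
        // (GCC reduces the value modulo 2^N).
        result = static_cast<T>(result + data[i] + Constant);
    }
    return result;
}

// Runs the kernel `iterations` times and returns elapsed CPU seconds.
// The checksum is written through `sink` so the result is observably used.
template <typename T, T Constant>
double time_constant_add(const std::vector<T>& data, int iterations, T* sink) {
    assert(iterations > 0);
    std::clock_t start = std::clock();
    T checksum = 0;
    for (int i = 0; i < iterations; ++i) {
        checksum = static_cast<T>(checksum + sum_with_constant<T, Constant>(data));
    }
    *sink = checksum;
    return static_cast<double>(std::clock() - start) / CLOCKS_PER_SEC;
}
```

Building the same file with both compilers at -O3 and comparing the time reported by time_constant_add<int8_t, 10> would be one way to check whether the slowdown survives outside the full suite.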

Re: Performance degradation on g++ 4.6

2011-08-03 Thread Xinliang David Li
Scanning through the profile data you provided -- test functions such as test_constant<...> completely disappeared in 4.1's profile, which means they are inlined by gcc4.1. They exist in 4.6's profile. For the unsigned short case, where neither version inlines the call, the 4.6 version is much faster. D
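
One way to check the inlining hypothesis directly, independent of any --param tuning, is to pin the inlining decision with GCC's always_inline and noinline function attributes and time both variants under each compiler. This is a swapped-in technique, not something proposed in the thread, and the function names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <stdint.h>

// Shared loop body. always_inline asks GCC to inline it into every caller,
// so the two wrappers below differ only in how they themselves are called.
template <typename T>
inline __attribute__((always_inline))
T add_constant_kernel(const T* data, std::size_t n, T constant) {
    assert(data != 0 || n == 0);
    T result = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Computed in int, then narrowed back to T.
        result = static_cast<T>(result + data[i] + constant);
    }
    return result;
}

// Variant that the compiler must keep as a separate, out-of-line call.
__attribute__((noinline))
int8_t add_constant_outlined(const int8_t* data, std::size_t n) {
    return add_constant_kernel<int8_t>(data, n, 10);
}

// Variant that is forced inline into its caller.
inline __attribute__((always_inline))
int8_t add_constant_inlined(const int8_t* data, std::size_t n) {
    return add_constant_kernel<int8_t>(data, n, 10);
}
```

If the outlined variant is slow on both compilers and the inlined variant is fast on both, the difference in the profiles is probably down to inlining decisions rather than code generation inside the loop.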

Re: Performance degradation on g++ 4.6

2011-08-02 Thread Richard Guenther
On Mon, Aug 1, 2011 at 8:43 PM, Oleg Smolsky wrote: > On 2011/7/29 14:07, Xinliang David Li wrote: >> >> Profiling tools are your best friend here. If you don't have access to >> any, the least you can do is to build the program with -pg option and >> use gprof tool to find out differences. > > Th

Re: Performance degradation on g++ 4.6

2011-08-01 Thread Xinliang David Li
Try isolating the int8_t constant folding test from the rest to see if the slowdown can be reproduced with the isolated case. If the problem disappears, it is likely due to the following inline parameters: large-function-insns, large-function-growth, large-unit-insns, inline-unit-growth. For inst

Re: Performance degradation on g++ 4.6

2011-08-01 Thread Marc Glisse
On Mon, 1 Aug 2011, Oleg Smolsky wrote:
> BTW, some of these tweaks increase the binary size to 99K, yet there is no performance increase.
I don't see this in the thread: did you use -march=native?
-- Marc Glisse

Re: Performance degradation on g++ 4.6

2011-08-01 Thread Oleg Smolsky
Hi Benjamin, On 2011/7/30 06:22, Benjamin Redelings I wrote: I had some performance degradation with 4.6 as well. However, I was able to cure it by using -finline-limit=800 or 1000 I think. However, this led to a code size increase. Were the old higher-performance binaries larger? Yes, th

Re: Performance degradation on g++ 4.6

2011-07-30 Thread Benjamin Redelings I
Hi Oleg, I had some performance degradation with 4.6 as well. However, I was able to cure it by using -finline-limit=800 or 1000, I think. However, this led to a code size increase. Were the old higher-performance binaries larger? IIRC, setting finline-limit=n actually sets two params to n

Re: Performance degradation on g++ 4.6

2011-07-30 Thread Richard Guenther
On Fri, Jul 29, 2011 at 7:56 PM, Oleg Smolsky wrote: > Hi there, I have compiled and run a set of C++ benchmarks on a CentOS4/64 > box using the following compilers: >    a) g++4.1 that is available for this distro (GCC version 4.1.2 20071124 > (Red Hat 4.1.2-42) >    b) g++4.6 that I built (stock

Re: Performance degradation on g++ 4.6

2011-07-29 Thread Xinliang David Li
On Fri, Jul 29, 2011 at 11:57 AM, Oleg Smolsky wrote: > Hey David, here are a couple of answers and notes: >    - I built the test suite with -O3 and cannot see anything else related to > inlining that isn't already ON (except for -finline-limit=n which I do not > know how to use) size estimation, inl

Re: Performance degradation on g++ 4.6

2011-07-29 Thread Oleg Smolsky
Hey David, here are a couple of answers and notes: - I built the test suite with -O3 and cannot see anything else related to inlining that isn't already ON (except for -finline-limit=n, which I do not know how to use) http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html - FTO looks lik

Re: Performance degradation on g++ 4.6

2011-07-29 Thread Xinliang David Li
My guess is inlining differences. Try more aggressive inline parameters to see if that helps. Also try FDO to see whether there is any performance difference between the two versions. You will probably need to do first-level triage and file bug reports. David On Fri, Jul 29, 2011 at 10:56 AM, Oleg Smolsky wrote