Scanning through the profile data you provided -- test functions such
as test_constant<unsigned short, custom_constant_multiply<short> ...>
have completely disappeared from 4.1's profile, which means they were
inlined by gcc 4.1. They still exist in 4.6's profile. For the unsigned
short case, where neither version inlines the call, the 4.6 version is
much faster.
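
To illustrate why an inlined function drops out of a gprof flat profile,
here is a minimal sketch. It is not the benchmark source: the names
constant_multiply_sketch, sum_out_of_line and sum_inlined are
hypothetical, the constant 120 is arbitrary, and the noinline and
always_inline attributes are GCC extensions used here only to force
each behaviour.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one of the benchmark's functors (assumption,
// not the original code): multiplies its input by a compile-time constant.
template <typename T>
struct constant_multiply_sketch {
    static T apply(T input) { return static_cast<T>(input * 120); }
};

// Forced out of line: the function keeps its own symbol and call sites,
// so a -pg build can attribute time to it as a separate profile entry.
template <typename T, typename Op>
__attribute__((noinline)) T sum_out_of_line(const T* first, std::size_t count) {
    T result = 0;
    for (std::size_t i = 0; i < count; ++i)
        result += Op::apply(first[i]);
    return result;
}

// Forced inline: the body is merged into each caller, so its time is
// charged to the caller and no separate entry appears in the profile.
template <typename T, typename Op>
inline __attribute__((always_inline)) T sum_inlined(const T* first, std::size_t count) {
    T result = 0;
    for (std::size_t i = 0; i < count; ++i)
        result += Op::apply(first[i]);
    return result;
}
```

Both functions compute the same value; only their visibility in the
profile differs. When a compiler makes the inlining decision on its
own, as gcc 4.1 and 4.6 do here, the per-function timings of the two
builds are not directly comparable unless the function is out of line
in both.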

David

On Mon, Aug 1, 2011 at 11:43 AM, Oleg Smolsky <oleg.smol...@riverbed.com> wrote:
> On 2011/7/29 14:07, Xinliang David Li wrote:
>>
>> Profiling tools are your best friend here. If you don't have access to
>> any, the least you can do is build the program with the -pg option and
>> use the gprof tool to find the differences.
>
> The test suite has a bunch of very basic C++ tests that are executed an
> enormous number of times. I've built one with the obvious performance
> degradation and attached the source, output and reports.
>
> Here are some highlights:
>    v4.1:    Total absolute time for int8_t constant folding: 30.42 sec
>    v4.6:    Total absolute time for int8_t constant folding: 43.32 sec
>
> Every one of the tests in this section has degraded, the first half more
> than the second. I am not sure how much further I can take this - the
> benchmarked code is very short and plain. I can post disassembly for one
> (some?) of them if anyone is willing to take a look...
>
> Thanks,
> Oleg.
>
