Re: Libav benchmarks

2012-07-16 Thread Mans Rullgard
On 10 July 2012 14:57, Ramana Radhakrishnan wrote:
> On 6 July 2012 16:52, Mans Rullgard wrote:
>> I ran my usual set of benchmarks of libav compiled with the current gcc
>> releases (hand-written assembly disabled).  The results are in this
>> spreadsheet:
>> https://docs.google.com/spreadsheet/ccc?key=0AguHvNGaLXy9dHExeWZ1YWZ1c0s2VnpJRkl2bVRPU2c
>>
>> First the good news: almost everything is faster with 4.6+ than with
>> linaro-4.5.
>>
>> The bad news is that some things have regressed since 4.6, even if not
>> all the way back to 4.5 levels.  A few especially problematic pieces
>> stand out:
>>
>> - The mp3 test performs 5-15% worse.  This regression is (mostly)
>>   attributable to the ff_mpadsp_apply_window_fixed [1] function.
>>   We have looked at this one before.
>>
>> - FLAC is 9% slower in upstream 4.7/4.8 compared to Linaro releases.
>>   Here flac_lpc_16_c [2] and flac_decorrelate_indep_c_16 [3] are
>>   mainly to blame.
>>
>
> Looking at this in the middle of the summit: for the vectorized case of
> flac_lpc_16_c, could you take a look with perf and say which part is
> hot?
>
> Is it the top-level nested loop over i and j, or the loop that does a
> summation when i < len?
>
> The non-vectorized case looks interesting because it might be fallout
> from sched-pressure.

Here's the perf annotate output for that function from 4.8 trunk with
vectorisation enabled:

 Percent |  Source code & Disassembly of avconv

 :
 :
 :
 :  Disassembly of section .text:
 :
 :  002aa55c :
 :  #define SAMPLE_SIZE 32
 :  #include "flacdsp_template.c"
 :
 :  static void flac_lpc_16_c(int32_t *decoded, const int coeffs[32],
 :                            int pred_order, int qlevel, int len)
 :  {
0.02 :2aa55c:   push {r4, r5, r6, r7, r8, r9, sl, fp}
0.00 :2aa560:   sub sp, sp, #80 ; 0x50
0.00 :2aa564:   str r0, [sp, #68]   ; 0x44
 :  int i, j;
 :
 :  for (i = pred_order; i < len - 1; i += 2) {
0.00 :2aa568:   ldr r0, [sp, #112]  ; 0x70
 :  #define SAMPLE_SIZE 32
 :  #include "flacdsp_template.c"
 :
 :  static void flac_lpc_16_c(int32_t *decoded, const int coeffs[32],
 :                            int pred_order, int qlevel, int len)
 :  {
0.00 :2aa56c:   str r2, [sp, #60]   ; 0x3c
0.00 :2aa570:   str r1, [sp, #52]   ; 0x34
 :  int i, j;
 :
 :  for (i = pred_order; i < len - 1; i += 2) {
0.00 :2aa574:   sub r0, r0, #1
 :  #define SAMPLE_SIZE 32
 :  #include "flacdsp_template.c"
 :
 :  static void flac_lpc_16_c(int32_t *decoded, const int coeffs[32],
 :                            int pred_order, int qlevel, int len)
 :  {
0.00 :2aa578:   str r3, [sp, #56]   ; 0x38
 :  int i, j;
 :
 :  for (i = pred_order; i < len - 1; i += 2) {
0.00 :2aa57c:   cmp r2, r0
0.00 :2aa580:   str r0, [sp, #72]   ; 0x48
0.00 :2aa584:   bge 2aa93c 
 :
 :  #undef  SAMPLE_SIZE
 :  #define SAMPLE_SIZE 32
 :  #include "flacdsp_template.c"
 :
 :  static void flac_lpc_16_c(int32_t *decoded, const int coeffs[32],
0.00 :2aa588:   add r3, r2, #4
0.00 :2aa58c:   mov r8, r2
0.00 :2aa590:   lsl r3, r3, #2
0.00 :2aa594:   sub r2, r2, #10
0.00 :2aa598:   ldr sl, [sp, #68]   ; 0x44
0.00 :2aa59c:   bic r2, r2, #7
0.00 :2aa5a0:   ldr ip, [sp, #68]   ; 0x44
0.00 :2aa5a4:   mov r0, r1
0.00 :2aa5a8:   rsb r2, r2, r8
0.00 :2aa5ac:   sub r1, r3, #16
0.00 :2aa5b0:   sub r9, r8, #1
0.00 :2aa5b4:   add sl, sl, #16
0.00 :2aa5b8:   add r3, ip, r3
0.00 :2aa5bc:   add r1, r0, r1
0.00 :2aa5c0:   sub r2, r2, #9
0.00 :2aa5c4:   str r9, [sp, #64]   ; 0x40
0.00 :2aa5c8:   str sl, [sp, #44]   ; 0x2c
0.00 :2aa5cc:   str r3, [sp, #36]   ; 0x24
0.00 :2aa5d0:   str r1, [sp, #76]   ; 0x4c
0.00 :2aa5d4:   str r2, [sp, #40]   ; 0x28
0.00 :2aa5d8:   str r8, [sp, #48]   ; 0x30
 :
 :  for (i = pred_order; i < len - 1; i += 2) {
 :  int c;
 :  int d = decoded[i-pred_order];
 : 
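
For readers following along without the libav source, the two loops
Ramana asks about can be pictured with the sketch below.  It is a
simplified illustration, not the code in flacdsp_template.c: the name
lpc_sketch, the 64-bit accumulators and the plain (non-interleaved)
inner loops are assumptions made for clarity.  Only the parameter list
and the outer-loop bounds follow what is visible in the annotate output
above.

```c
#include <stdint.h>

/*
 * Hypothetical, simplified LPC predictor.  NOT the libav implementation;
 * it only mirrors the loop structure under discussion.
 */
static void lpc_sketch(int32_t *decoded, const int *coeffs,
                       int pred_order, int qlevel, int len)
{
    int i, j;

    /* Top-level nested loop: two output samples per outer iteration. */
    for (i = pred_order; i < len - 1; i += 2) {
        int64_t s0 = 0, s1 = 0;

        for (j = 0; j < pred_order; j++)
            s0 += (int64_t)coeffs[j] * decoded[i - pred_order + j];
        /* Right shift of a negative value is implementation-defined in C;
         * GCC performs an arithmetic shift, which is assumed here. */
        decoded[i] += (int32_t)(s0 >> qlevel);

        /* The second sample depends on the one just written. */
        for (j = 0; j < pred_order; j++)
            s1 += (int64_t)coeffs[j] * decoded[i + 1 - pred_order + j];
        decoded[i + 1] += (int32_t)(s1 >> qlevel);
    }

    /* Tail: the single summation that runs when i < len. */
    if (i < len) {
        int64_t sum = 0;

        for (j = 0; j < pred_order; j++)
            sum += (int64_t)coeffs[j] * decoded[i - pred_order + j];
        decoded[i] += (int32_t)(sum >> qlevel);
    }
}
```

With decoded = {1, 2, 0, 0, 0}, coeffs = {1, 1}, pred_order = 2,
qlevel = 0 and len = 5, the main loop fills decoded[2] and decoded[3]
and the tail fills decoded[4], giving {1, 2, 3, 5, 8}.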

Re: Ongoing benchmark graphs

2012-07-16 Thread William Mills

On 07/15/2012 09:51 PM, Michael Hope wrote:
> We've just started running a weekly benchmark of GCC trunk and Linaro
> GCC tip.  I've written a short script that compares against a baseline
> and spits out a graph:
>   http://ex.seabright.co.nz/benchmarks/gcc-4.8~svn.png
>   http://ex.seabright.co.nz/benchmarks/gcc-linaro-4.7%2bbzr.png

The above links require authorization and do not appear to be visible to
the public.  Was this the intent?



___
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain


Re: Ongoing benchmark graphs

2012-07-16 Thread Michael Hope
On 17 July 2012 05:51, William Mills wrote:
> On 07/15/2012 09:51 PM, Michael Hope wrote:
>>
>> We've just started running a weekly benchmark of GCC trunk and Linaro
>> GCC tip.  I've written a short script that compares against a baseline
>> and spits out a graph:
>>   http://ex.seabright.co.nz/benchmarks/gcc-4.8~svn.png
>>   http://ex.seabright.co.nz/benchmarks/gcc-linaro-4.7%2bbzr.png
>>
>
> The above links require authorization and do not appear to be visible to the
> public.  Was this the intent?

Unfortunately yes.  The graphs include SPEC and EEMBC results which
are licensed and can't be freely shared.  We can share them with other
licensees so please contact me if you'd like access to this or the
restricted linaro-toolchain-benchmarks mailing list.

It's an unfortunate wart.  We work in the open but there's no good
alternative to these restricted benchmarks.

-- Michael