[Bug rtl-optimization/23785] New: 197.parser performance drop

uttamp at us dot ibm dot com Thu, 08 Sep 2005 15:05:36 -0700

I run nightly SPEC CPU2000 benchmark tests with mainline GCC, and I have seen a
few performance drops since July 29, 2005. I am currently analysing the
197.parser benchmark, which dropped 15 points compared with the previous run on
July 28. I have traced the drop to the following two patches:
1) http://gcc.gnu.org/ml/gcc-cvs/2005-07/msg01016.html - 8 point drop
2) http://gcc.gnu.org/ml/gcc-cvs/2005-07/msg01034.html - 7 point drop


I have verified that these two patches caused the drop. After looking at the
first patch, I found that the new assignment statement
(node->global.estimated_growth = INT_MIN) in the function update_caller_keys
made the difference; the rest of the patch did not matter. I have not studied
the second patch yet.
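
To illustrate why a single assignment like this can change results, here is a minimal Python model of a cached estimate that is invalidated by a sentinel value. It is only a sketch under assumptions: it assumes estimated_growth acts as a cache in which INT_MIN means "not computed yet", and the Node class, its fields, and the toy growth formula are hypothetical and are not taken from the GCC sources.

```python
import sys

# Stand-in for C's INT_MIN; assumed here to mean "no cached value".
INT_MIN = -sys.maxsize - 1


class Node:
    """Hypothetical, simplified stand-in for a call-graph node."""

    def __init__(self, name, callers, size):
        self.name = name
        self.callers = callers            # number of call sites (toy field)
        self.size = size                  # body size estimate (toy field)
        self.estimated_growth = INT_MIN   # sentinel: nothing cached yet


def estimate_growth(node):
    """Return the cached growth, recomputing only if the cache is invalid."""
    if node.estimated_growth != INT_MIN:
        return node.estimated_growth
    # Toy formula: inline the body at every call site, then drop the original.
    node.estimated_growth = node.callers * node.size - node.size
    return node.estimated_growth


def invalidate(node):
    """Model of the added assignment: force the next query to recompute."""
    node.estimated_growth = INT_MIN
```

In this model, a query made after the node's callers change still returns the stale cached value until invalidate() is called. Any heuristic that orders candidates by that estimate can therefore make different decisions once the reset is added.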

With only the first patch applied, four (4) of the object files in the
197.parser benchmark differ in code and size from the ones built without the
patch.
Those files are as follows:
post-process.o: +260 bytes with the patch compared to without the patch
prune.o: -672 bytes with the patch
read-dict.o: -336 bytes with the patch
utilities.o: +80 bytes with the patch
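
As a quick cross-check of the numbers above, the following sketch totals the reported per-file deltas and shows one way to compute such deltas from two sets of measured sizes. The helper name size_deltas and its input dictionaries are illustrative assumptions; how the sizes are measured (for example with a tool such as size) is left open.

```python
# Size deltas in bytes as reported above: patched build minus unpatched build.
reported_deltas = {
    "post-process.o": 260,
    "prune.o": -672,
    "read-dict.o": -336,
    "utilities.o": 80,
}


def size_deltas(sizes_with, sizes_without):
    """Return {file: with - without} for files present in both builds
    whose sizes differ. Both arguments map file names to sizes in bytes."""
    return {
        name: sizes_with[name] - sizes_without[name]
        for name in sizes_with
        if name in sizes_without and sizes_with[name] != sizes_without[name]
    }


net = sum(reported_deltas.values())
print(f"net change: {net:+d} bytes")  # prints "net change: -668 bytes"
```

The net change is negative: taken together, the four object files are smaller with the patch than without it.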

The following is a side-by-side comparison of the object code of
post-process.o with and without the patch. Lines that differ are marked with |.

WITH THE PATCH                              WITHOUT PATCH
<post_process>:                             <post_process>:
    mflr    r3                                  mflr    r3
    stwu    r1,-576(r1)                         stwu    r1,-576(r1)
    lis r9,0                                    lis r9,0
    stw r3,580(r1)                              stw r3,580(r1)
    stw r18,520(r1)                             stw r18,520(r1)
    stw r19,524(r1)                             stw r19,524(r1)
    stw r20,528(r1)                             stw r20,528(r1)
    stw r21,532(r1)                             stw r21,532(r1)
    stw r22,536(r1)                             stw r22,536(r1)
    stw r23,540(r1)                             stw r23,540(r1)
    stw r24,544(r1)                             stw r24,544(r1)
    stw r25,548(r1)                             stw r25,548(r1)
    stw r26,552(r1)                             stw r26,552(r1)
    stw r27,556(r1)                             stw r27,556(r1)
    stw r28,560(r1)                             stw r28,560(r1)
    stw r29,564(r1)                             stw r29,564(r1)
    stw r30,568(r1)                             stw r30,568(r1)
    stw r31,572(r1)                             stw r31,572(r1)
    lwz r0,0(r9)                                lwz r0,0(r9)
    cmpwi   cr7,r0,0                            cmpwi   cr7,r0,0
    bne-  cr7,5a9c <post_process+0x1ec>         bne-  cr7,5a9c <post_process+0x1ec>
    lis r29,0                                   lis r29,0
    li  r3,8                                    li  r3,8
    bl  590c <post_process+0x5c>                bl  590c <post_process+0x5c>
    lwz r4,0(r29)                               lwz r4,0(r29)
    mr  r22,r3                            |     mr  r23,r3
    rlwinm  r3,r4,2,0,29                        rlwinm  r3,r4,2,0,29
    bl  591c <post_process+0x6c>                bl  591c <post_process+0x6c>
    lwz r11,0(r29)                              lwz r11,0(r29)
    stw r3,0(r22)                         |     stw r3,0(r23)
    cmpwi   r11,0                               cmpwi   r11,0
    ble-    5a48 <post_process+0x198>           ble-    5a48 <post_process+0x198>
    li  r31,1                                   li  r31,1
    li  r30,0                                   li  r30,0
    addi    r10,r11,-1                          addi    r10,r11,-1
    cmpw    cr6,r31,r11                   |     cmpw    cr1,r31,r11
    stw r30,0(r3)                               stw r30,0(r3)
    clrlwi  r0,r10,29                           clrlwi  r0,r10,29
    beq-    cr6,5a48 <post_process+0x198> |     beq-    cr1,5a48 <post_process+0x198>
    cmpwi   cr7,r0,0                      |     cmpwi   r0,0
    beq-    cr7,59dc <post_process+0x12c> |     beq-    59dc <post_process+0x12c>
    cmpwi   r0,1                          |     cmpwi   cr7,r0,1
    beq-    59c8 <post_process+0x118>     |     beq-    cr7,59c8 <post_process+0x118>
    cmpwi   cr1,r0,2                      |     cmpwi   cr6,r0,2
    beq-    cr1,59bc <post_process+0x10c> |     beq-    cr6,59bc <post_process+0x10c>
    cmpwi   cr6,r0,3                      |     cmpwi   cr1,r0,3
    beq-    cr6,59b0 <post_process+0x100> |     beq-    cr1,59b0 <post_process+0x100>
    cmpwi   cr7,r0,4                      |     cmpwi   r0,4
    beq-    cr7,59a4 <post_process+0xf4>  |     beq-    59a4 <post_process+0xf4>
    stw r30,4(r3)                               stw r30,4(r3)
    li  r31,2                                   li  r31,2
    rlwinm  r21,r31,2,0,29                |     rlwinm  r22,r31,2,0,29
    addi    r31,r31,1                           addi    r31,r31,1
    stwx    r30,r21,r3                    |     stwx    r30,r22,r3
    rlwinm  r19,r31,2,0,29                |     rlwinm  r18,r31,2,0,29
    addi    r31,r31,1                           addi    r31,r31,1
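
Most of the marked lines in this excerpt differ only in which general-purpose or condition register is used. A small helper along the following lines could separate such register-only differences from real instruction changes when going through a long listing. This is a sketch: the function names are hypothetical and the regular expression only recognises the rN and crN register spellings that appear in this listing.

```python
import re

# Matches register operands such as r22 or cr7 (spellings used in the listing).
_REG = re.compile(r"\b(cr|r)\d+\b")


def normalize(insn):
    """Collapse whitespace and replace register numbers with a placeholder."""
    return _REG.sub(r"\1N", " ".join(insn.split()))


def differs_only_in_registers(left, right):
    """True if the two instructions differ, but only in register numbers."""
    l, r = " ".join(left.split()), " ".join(right.split())
    return l != r and normalize(l) == normalize(r)


print(differs_only_in_registers("mr  r22,r3", "mr  r23,r3"))
print(differs_only_in_registers("cmpwi   cr7,r0,0", "cmpwi   r0,0"))
```

The first call reports a register-only difference. The second does not, because the operand count differs: the form without an explicit condition register field uses cr0 implicitly. Pairs like that need a slightly smarter comparison or a manual look.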

I have a lot more data. I have also taken -fdump-ipa-cgraph dumps of the
benchmark with and without the patch, and I can add them later if needed.

Thanks.

-- 
           Summary: 197.parser performance drop
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: rtl-optimization
        AssignedTo: unassigned at gcc dot gnu dot org
        ReportedBy: uttamp at us dot ibm dot com
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: powerpc-linux
  GCC host triplet: powerpc-linux
GCC target triplet: powerpc-linux


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23785
