On 6/5/25 6:04 PM, Vineet Gupta wrote:
Changes since v1
   - Dropped removal of TARGET_MODE_AFTER
   - NFC changes to the last 2 patches, with reattribution to the PRs they
     address separately.

Hi,

This came out of the Rivos perf team reporting (shoutout to Siavash) that
some of the SPEC2017 workloads had FRM wiggles where none were needed.
The writes in particular could be expensive.

I started with a reduced test for PR/119164 from blender:node_texture_util.c.

However, in trying to understand it (and after a botched rewrite of the
whole thing), it turned out that a lot of the code was simply unnecessary,
leading to more complexity than warranted. As a result most of the series
is deletions, and the actual improvements come from just a few lines of
real changes.

I've verified each patch incrementally with
  - Testsuite run (unchanged, 1 unexpected pass:
    gcc.target/riscv/rvv/autovec/pr119114.c)
  - SPEC build
  - Static analysis of FRM read/write insns emitted in all of the SPEC
    binaries.
  - There's BPI data for some of this too, but the delta there is not
    significant as this could really be uarch specific.
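
For reference, a static count like this could be reproduced roughly as
follows. This is only a sketch, not the script actually used for the
numbers below. It assumes the input is `objdump -d` text in the usual
tab-separated layout (address, raw bytes, mnemonic, operands) and that the
disassembler prints the frrm/fsrm/fsrmi alias mnemonics rather than the raw
CSR instructions. The name count_frm_ops is made up for illustration.

```python
from collections import Counter

# frrm reads the rounding mode; fsrm and fsrmi write it.
FRM_OPS = ("frrm", "fsrmi", "fsrm")

def count_frm_ops(disassembly):
    """Count FRM read/write mnemonics in objdump -d style text.

    Assumption: instruction lines look like
        "<addr>:\t<raw bytes>\t<mnemonic>\t<operands>"
    Lines that do not fit that shape (labels, blank lines) are skipped.
    """
    counts = Counter({op: 0 for op in FRM_OPS})
    for line in disassembly.splitlines():
        fields = line.split("\t")
        if len(fields) < 3 or not fields[0].strip().endswith(":"):
            continue
        parts = fields[2].split()
        mnemonic = parts[0] if parts else ""
        # Exact match only, so "fsrm" is never counted as "fsrmi" or vice versa.
        if mnemonic in counts:
            counts[mnemonic] += 1
    return counts
```

Running this over the disassembly of each SPEC binary and summing the three
counters would give one row per benchmark in the same frrm/fsrmi/fsrm form
as the table.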

Here are the results of the static analysis.

                  1. revert-confluence   2. remove-edge-insert  4-fewer-frm-restore    5-call-backtrack
                  -------------------    -------------------    -------------------    -------------------
                   frrm fsrmi  fsrm       frrm fsrmi  fsrm       frrm fsrmi  fsrm       frrm fsrmi  fsrm
     perlbench_r     42     0     4         42     0     4         17     0     1         17     0     1
        cpugcc_r    167     0    17        167     0    17         11     0     0         11     0     0
        bwaves_r     16     0     1         16     0     1         16     0     1         16     0     1
           mcf_r     11     0     0         11     0     0         11     0     0         11     0     0
    cactusBSSN_r     79     0    27         76     0    27         19     0     1         19     0     1
          namd_r    119     0    63        119     0    63         14     0     1         14     0     1
        parest_r    218     0   114        168     0   114         24     0     1         24     0     1
        povray_r    123     1    17        123     1    17         26     1     6         26     1     6
           lbm_r      6     0     0          6     0     0          6     0     0          6     0     0
       omnetpp_r     17     0     1         17     0     1         17     0     1         17     0     1
           wrf_r   2287    13  1956       2287    13  1956       1268    13  1603        613    13    82
      cpuxalan_r     17     0     1         17     0     1         17     0     1         17     0     1
        ldecod_r     11     0     0         11     0     0         11     0     0         11     0     0
          x264_r     14     0     1         14     0     1         11     0     0         11     0     0
       blender_r    724    12   182        724    12   182         61    12    42         39    12    16
          cam4_r    324    13   169        324    13   169         45    13    20         40    13    17
     deepsjeng_r     11     0     0         11     0     0         11     0     0         11     0     0
       imagick_r    265    16    34        265    16    34        132    16    25         33    16    18
         leela_r     12     0     0         12     0     0         12     0     0         12     0     0
           nab_r     13     0     1         13     0     1         13     0     1         13     0     1
     exchange2_r     16     0     1         16     0     1         16     0     1         16     0     1
     fotonik3d_r     20     0    11         20     0    11         19     0     1         19     0     1
          roms_r     33     0    23         33     0    23         21     0     1         21     0     1
            xz_r      6     0     0          6     0     0          6     0     0          6     0     0
                  -------------------    -------------------    -------------------    -------------------
                   4551    55  2623       4498    55  2623       1804    55  1707       1023    55   150
                  -------------------    -------------------    -------------------    -------------------
                         7229                   7176                   3566                   1228
                  -------------------    -------------------    -------------------    -------------------

Note that wrf still has a ridiculously high number of FRM ops, which will
be tackled as a follow-up.
#1-#4 are still OK.  I just need to work my way through the last one ;-)

jeff
