https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69274

--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
The key assembly difference seems to be extra reg-reg copies around a loop.
But maybe perf lies to me (the description cites fsettle as the real offender
but perf points me to inl1130).  As I can reproduce the issue even w/o
-fschedule-insns2, I doubt the issue is scheduling related (but I didn't try
whether -fschedule-insns fixes things).

Both loops are rather large though (and vectorized).

perf for me shows (reproducible)

Samples: 2M of event 'cycles', Event count (approx.): 2251122016217
 31.85%  gromacs_base.am  gromacs_base.amd64-m64-gcc42-nn  [.] inl1130_
 24.64%  gromacs_peak.am  gromacs_peak.amd64-m64-gcc42-nn  [.] inl1130_
  6.46%  gromacs_base.am  gromacs_base.amd64-m64-gcc42-nn  [.] search_neighbour
  6.39%  gromacs_peak.am  gromacs_peak.amd64-m64-gcc42-nn  [.] search_neighbour
  1.70%  gromacs_base.am  gromacs_base.amd64-m64-gcc42-nn  [.] inl1100_
  1.69%  gromacs_peak.am  gromacs_peak.amd64-m64-gcc42-nn  [.] inl1100_
  0.96%  gromacs_base.am  gromacs_base.amd64-m64-gcc42-nn  [.] inl1000_
  0.96%  gromacs_peak.am  gromacs_peak.amd64-m64-gcc42-nn  [.] inl1000_
  0.86%  gromacs_base.am  gromacs_base.amd64-m64-gcc42-nn  [.] inl0100_
  0.85%  gromacs_peak.am  gromacs_peak.amd64-m64-gcc42-nn  [.] inl0100_
  0.75%  gromacs_base.am  gromacs_base.amd64-m64-gcc42-nn  [.] inl1120_
  0.70%  gromacs_peak.am  gromacs_peak.amd64-m64-gcc42-nn  [.] fsettle_
  0.68%  gromacs_base.am  gromacs_base.amd64-m64-gcc42-nn  [.] fsettle_
  0.65%  gromacs_peak.am  gromacs_peak.amd64-m64-gcc42-nn  [.] inl1120_
  0.61%  gromacs_base.am  gromacs_base.amd64-m64-gcc42-nn  [.] update
  0.61%  gromacs_peak.am  gromacs_peak.amd64-m64-gcc42-nn  [.] update

Thus the hot spot is not fsettle but inl1130 (the attached).

BASE separated:

Samples: 1M of event 'cycles', Event count (approx.): 1045723281935, DSO: gromac
 68.56%  gromacs_base.am  [.] inl1130_
 13.92%  gromacs_base.am  [.] search_neighbours
  3.65%  gromacs_base.am  [.] inl1100_
  2.07%  gromacs_base.am  [.] inl1000_
  1.84%  gromacs_base.am  [.] inl0100_
  1.61%  gromacs_base.am  [.] inl1120_
  1.47%  gromacs_base.am  [.] fsettle_

PEAK separated:

Samples: 1M of event 'cycles', Event count (approx.): 879799417599, DSO: gromacs
 63.05%  gromacs_peak.am  [.] inl1130_
 16.35%  gromacs_peak.am  [.] search_neighbours
  4.33%  gromacs_peak.am  [.] inl1100_
  2.45%  gromacs_peak.am  [.] inl1000_
  2.18%  gromacs_peak.am  [.] inl0100_
  1.79%  gromacs_peak.am  [.] fsettle_
  1.66%  gromacs_peak.am  [.] inl1120_
  1.56%  gromacs_peak.am  [.] update
  1.56%  gromacs_peak.am  [.] put_in_list.constprop.14
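
The percentages in the two separated reports are relative to different totals,
so they cannot be compared directly.  A rough sketch of turning them into
absolute cycle estimates follows.  It assumes each percentage is a share of
the approximate event count printed in that report's header; the helper names
(estimate_cycles, cycle_deltas) are made up for illustration and only a few
of the symbols listed in both reports are included.

```python
# Figures copied from the separated BASE and PEAK perf reports.
BASE_TOTAL = 1045723281935  # approx. event count, base binary
PEAK_TOTAL = 879799417599   # approx. event count, peak binary

# Percent of cycles per symbol, as printed by perf.
BASE_PCT = {"inl1130_": 68.56, "search_neighbours": 13.92,
            "inl1100_": 3.65, "fsettle_": 1.47}
PEAK_PCT = {"inl1130_": 63.05, "search_neighbours": 16.35,
            "inl1100_": 4.33, "fsettle_": 1.79}


def estimate_cycles(total, pct):
    """Assumption: pct is a share of the report's total event count."""
    return total * pct / 100.0


def cycle_deltas(base_total, base_pct, peak_total, peak_pct):
    """Estimated base-minus-peak cycles for symbols present in both reports."""
    return {
        sym: estimate_cycles(base_total, base_pct[sym])
        - estimate_cycles(peak_total, peak_pct[sym])
        for sym in base_pct
        if sym in peak_pct
    }


deltas = cycle_deltas(BASE_TOTAL, BASE_PCT, PEAK_TOTAL, PEAK_PCT)
total_delta = BASE_TOTAL - PEAK_TOTAL

# Largest estimated slowdown first; the last column is the share of the
# overall base-vs-peak difference attributed to that symbol.
for sym, delta in sorted(deltas.items(), key=lambda kv: -kv[1]):
    print(f"{sym:20s} {delta / 1e9:10.1f} Gcycles "
          f"{100.0 * delta / total_delta:7.1f}% of total difference")
```

With these inputs nearly all of the base-vs-peak difference lands in inl1130_,
while the fsettle_ estimate is slightly lower for base than for peak, which
matches the reading above that fsettle is not the offender.  The numbers are
only as good as perf's sampling, so treat them as an approximation.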
