https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104122

            Bug ID: 104122
           Summary: On Zen3, 510.parest_r (built with -Ofast) is faster
                    with generic than with native tuning
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jamborm at gcc dot gnu.org
                CC: hubicka at gcc dot gnu.org
            Blocks: 26163
  Target Milestone: ---
              Host: x86_64-linux
            Target: x86_64-linux

On Zen3 based CPUs, benchmark 510.parest_r from the SPEC 2017 FPrate is faster
with -march=generic than with -march=native.  LNT reports an 11% regression:

https://lnt.opensuse.org/db_default/v4/SPEC/graph?plot.0=463.457.0&plot.1=471.457.0&;

However, my own measurements on a different but similar EPYC machine suggest it
can be as high as 26%.  On yet another Ryzen machine I can see almost 10%
too.  I only have older-than-LNT data from the Ryzen machine, and we did not see
the regression when gcc 11 was released.  However, it seems that the generic
tuning improved while the native one did not.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
[Bug 26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)
