[Bug fortran/88533] [9 Regression] Higher performance penalty of array-bounds checking for sparse-matrix vector multiply

tkoenig at gcc dot gnu.org Mon, 17 Dec 2018 13:47:16 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88533


Thomas Koenig <tkoenig at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tkoenig at gcc dot gnu.org

--- Comment #2 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
Strange.

I ran the code (with the data) a few times on my Ryzen 7 home
system.

Here are some timings (10 runs for each set of options):

$ gfortran -O3 -ftree-vectorize -g  csc_test.f90
$ for a in 1 2 3 4 5 6 7 8 9 10; do ./a.out; done
 CPU time [s]:  1.20
 CPU time [s]:  2.52
 CPU time [s]:  2.53
 CPU time [s]:  2.53
 CPU time [s]:  2.53
 CPU time [s]:  2.53
 CPU time [s]:  2.53
 CPU time [s]:  1.18
 CPU time [s]:  2.49
 CPU time [s]:  2.53
$ gfortran -O3 -ftree-vectorize -fcheck=bounds -g  csc_test.f90
$ for a in 1 2 3 4 5 6 7 8 9 10; do ./a.out; done
 CPU time [s]:  1.28
 CPU time [s]:  2.62
 CPU time [s]:  2.62
 CPU time [s]:  2.60
 CPU time [s]:  2.59
 CPU time [s]:  2.60
 CPU time [s]:  2.60
 CPU time [s]:  2.63
 CPU time [s]:  2.65
 CPU time [s]:  2.57

What strikes me is that I hardly see any slowdown from bounds
checking (roughly 2.6 s versus 2.5 s for the typical run), and
that a few runs (three of the twenty, at about 1.2 s) are about
twice as fast as the others.
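
To compare the two builds without eyeballing the lists, the timings
can be reduced to a minimum, median and maximum. The following is
only a sketch: summarize_times is a hypothetical helper name, and it
assumes each run prints exactly one line of the form
" CPU time [s]:  1.20" as in the output above.

```shell
# Hypothetical helper: read lines like " CPU time [s]:  1.20" on stdin
# and print the number of runs plus min, median and max of the timings.
summarize_times() {
    # Keep only the last field (the time) of each timing line, sort numerically.
    awk '/CPU time/ { print $NF }' | sort -n | awk '
        { t[NR] = $1 }
        END {
            if (NR == 0) { print "no timings found"; exit 1 }
            # Median: middle element, or mean of the two middle elements.
            mid = (NR % 2) ? t[(NR + 1) / 2] : (t[NR / 2] + t[NR / 2 + 1]) / 2
            printf "runs=%d min=%.2f median=%.2f max=%.2f\n", NR, t[1], mid, t[NR]
        }'
}

# Usage (assumes ./a.out was built with one of the command lines above):
#   for a in 1 2 3 4 5 6 7 8 9 10; do ./a.out; done | summarize_times
```

The median is used rather than the mean so that the occasional fast
run does not distort the comparison between the two builds.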

Is it possible that the data size of the problem is just at
the edge of cache size, so that (depending on what else happens
on the system) it is possible to either get a lot of cache misses
or not?

(I made sure to always seed the random number generator with
the same values).
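
One way to probe the cache question above would be to pin the
process to a single core and count cache misses per run, then see
whether the fast runs are the ones with fewer misses. This is a
sketch under assumptions, not something that was run for this
report: it assumes Linux with perf and util-linux (taskset, lscpu)
installed, and cache_probe is a hypothetical helper name.

```shell
# Hypothetical helper: run a binary five times pinned to core 0 and
# report cache references and misses for each run via perf stat.
cache_probe() {
    bin=${1:-./a.out}
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found; skipping" >&2
        return 0
    fi
    # Show the cache sizes, to compare against the working set of the problem.
    lscpu | grep -i 'cache'
    for i in 1 2 3 4 5; do
        # Pin to core 0 so the scheduler cannot migrate the process mid-run.
        # perf stat prints its counters to stderr after the program exits.
        taskset -c 0 perf stat -e cache-references,cache-misses "$bin"
    done
}

# Usage:
#   cache_probe ./a.out
```

If the miss counts are similar for fast and slow runs, the cache
explanation becomes less likely and something else (for example
frequency scaling or other load on the system) would need a look.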
