https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

Tianyang Zhou <tianyang.chou at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tianyang.chou at gmail dot com

--- Comment #24 from Tianyang Zhou <tianyang.chou at gmail dot com> ---
(In reply to Chen Chen from comment #0)
> We tested Loongarch64 CPU Loongson 3A6000 with "LA664" architecture in Linux
> operating system AOSC OS 11.4.0 (default gcc version is 13.2.0). And we
> found the 548.exchange2_r benchmark from SPEC 2017 INTrate suite suffered
> significant regressions from 14% to 28% with various compiling options.
> 
> The rate-1 results are following:
> 
> after snapshot 20240317 score 14.3-19.3% lower with parameters "-g -Ofast
> -march=native":
> 13.2.0:    11.7 (223s) [gcc 13.2.0, system default]
> 20240317:  11.0 (237s) [gcc 14 snapshot 20240317]
> 20240324:  8.88 (295s) [gcc 14 snapshot 20240324]
> 20240430:  9.03 (290s) [gcc 14 snapshot 20240430, 14.1.0-RC]
> 14.1.0:    9.43 (278s) [gcc 14.1.0 release]
> 
> after snapshot 20240317 score 16.5-20.8% lower with parameters "-g -Ofast
> -march=native -flto": 
> 13.2.0:    12.0 (218s)
> 20240317:  10.6 (248s)
> 20240324:  8.40 (312s)
> 20240430:  8.48 (309s)
> 14.1.0:    8.85 (296s)
> 
> 
> after snapshot 20240317 score 18-23.1% lower with parameters "-g -Ofast
> -march=la664":       
> 13.2.0:    "-march=la664" flag is not supported
> 20240317:  11.5 (227s)
> 20240324:  8.84 (296s)
> 20240430:  9.43 (278s)
> 14.1.0:    9.42 (278s)
> 
> 
> after snapshot 20240317 score 20.3-21.2% lower with parameters "-g -Ofast
> -march=la664 -flto": 
> 13.2.0:    "-march=la664" flag is not supported
> 20240317:  11.1 (236s)
> 20240324:  8.75 (299s)
> 20240430:  8.85 (296s)
> 14.1.0:    8.85 (296s)
> 
> 
> after snapshot 20240317 score 26.3-26.6% lower with parameters "-g -Ofast
> -march=la464":       
> 13.2.0:    8.76 (299s)
> 20240317:  12.8 (205s)
> 20240324:  9.39 (279s)
> 20240430:  9.43 (278s)
> 14.1.0:    9.43 (278s)
> 
> 
> after snapshot 20240317 score 26.6-28% lower with parameters "-g -Ofast
> -march=la464 -flto": 
> 13.2.0:    8.52 (307s)
> 20240317:  12.8 (204s)
> 20240324:  9.22 (284s)
> 20240430:  9.37 (280s)
> 14.1.0:    9.40 (279s)
> 
> 
> The gcc 14 snapshots and gcc 14.1.0 are compiled with the following
> parameters: 
> 
> --enable-shared --enable-threads=posix --with-system-zlib
> --enable-gnu-indirect-function --enable-__cxa_atexit
> --disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch
> --disable-libssp --enable-gnu-unique-object --enable-linker-build-id
> --enable-lto --enable-plugin --enable-install-libiberty --disable-multilib
> --disable-werror --enable-pie --enable-checking=release
> --enable-libstdcxx-dual-abi --with-default-libstdcxx-abi=new
> --enable-default-pie --enable-default-ssp --enable-bootstrap
> --enable-languages=c,c++,fortran,lto --with-abi=lp64d
> --with-arch=loongarch64 --with-tune=la664 --build=loongarch64-aosc-linux-gnu
> 
> 
> The regression may be found on other types of CPUs as well. We did a quick
> test on AMD Zen4 CPU R9 7940HS and found similar but smaller regression:
> 
> The rate-1 results on x86_64 (AMD R9 7940HS) with operating system Debian 12:
> 
> after snapshot 20240317 score 8.6-9.6% lower with parameters "-m64 -g -Ofast
> -march=znver3":
> 12.2.0:    30.1 (87.0s) [gcc 12.2.0, system default]
> 13.2.0:    30.6 (85.7s) [gcc 13.2 release]
> 20240317:  31.4 (83.3s) [gcc 14 snapshot]
> 20240324:  28.7 (91.2s) [gcc 14 snapshot]
> 20240430:  28.4 (92.2s) [gcc 14 snapshot, 14.1.0-RC]
> 
> after snapshot 20240317 score 10% lower with parameters "-m64 -g -Ofast
> -march=znver3 -flto":
> 12.2.0:    29.0 (90.3s) 
> 13.2.0:    30.9 (84.9s) 
> 20240317:  32.0 (81.8s) 
> 20240324:  28.8 (90.9s) 
> 20240430:  28.8 (91.1s)
> 
> gcc13 and gcc14 are compiled with the following parameters:
> 
> --enable-shared --enable-threads=posix --with-system-zlib
> --enable-gnu-indirect-function --enable-__cxa_atexit
> --disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch
> --disable-libssp --enable-gnu-unique-object --enable-linker-build-id
> --enable-lto --enable-plugin --enable-install-libiberty --disable-multilib
> --disable-werror --enable-pie --enable-checking=release
> --enable-libstdcxx-dual-abi --with-default-libstdcxx-abi=new
> --enable-default-pie --enable-default-ssp --enable-bootstrap
> --enable-languages=c,c++,fortran,lto  --build=x86_64-linux-gnu
> --host=x86_64-linux-gnu --target=x86_64-linux-gnu

Sorry to raise something unrelated to this bug. I ran 548.exchange2_r on a
Loongson 3A6000 CPU with the same compiler version and compiler options as you,
but the score is only 8.5. Could you please tell me what I am missing? I can't
reproduce your performance result.

I downloaded the GCC source code from the GitHub repo AOSC-Tracking/gcc
(13.2.0) and configured it with the following parameters:

"--enable-shared --enable-threads=posix --with-system-zlib
--enable-gnu-indirect-function --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch
--disable-libssp --enable-gnu-unique-object --enable-linker-build-id
--enable-lto --enable-plugin --enable-install-libiberty --disable-multilib
--disable-werror --enable-pie --enable-checking=release
--enable-libstdcxx-dual-abi --with-default-libstdcxx-abi=new
--enable-default-pie --enable-default-ssp --enable-bootstrap
--enable-languages=c,c++,fortran,lto --with-abi=lp64d --with-arch=loongarch64
--with-tune=la664 --build=loongarch64-aosc-linux-gnu"
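
To put a number on the gap, here is a minimal sketch that compares the 8.5
score reported here against each GCC 13.2.0 rate-1 score quoted from
comment #0. The names REPORTED_13_2_0, OBSERVED, relative_gap and gaps are
illustrative assumptions, not part of any SPEC or GCC tooling, and the output
only ranks the quoted option sets by distance; it does not identify the cause
of the difference.

```python
# GCC 13.2.0 rate-1 scores for 548.exchange2_r, copied from comment #0.
REPORTED_13_2_0 = {
    "-g -Ofast -march=native": 11.7,
    "-g -Ofast -march=native -flto": 12.0,
    "-g -Ofast -march=la464": 8.76,
    "-g -Ofast -march=la464 -flto": 8.52,
}

# Score reported in this comment (the exact option set is not stated).
OBSERVED = 8.5


def relative_gap(reported, observed):
    """Return how far `observed` falls below `reported`, in percent."""
    return (reported - observed) / reported * 100.0


def gaps(observed=OBSERVED):
    """Map each quoted option set to its percentage gap from `observed`."""
    return {
        flags: relative_gap(score, observed)
        for flags, score in REPORTED_13_2_0.items()
    }


if __name__ == "__main__":
    for flags, gap in gaps().items():
        print(f"{flags:32s} {gap:5.1f}% lower")
```

Sorting the result by gap shows which quoted option set lies closest to the
observed score, which may help narrow down where the two setups differ.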
