https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114978

--- Comment #30 from Tianyang Chou <tianyang.chou at gmail dot com> ---
(In reply to Chen Chen from comment #27)
> I am a bit confused with your statement. For AOSC gcc 13.2 I got 8.52 with
> parameters "-g -Ofast -march=la464 -flto", and 8.76 with parameters "-g
> -Ofast -march=la464". These results are similar to yours.
> 
> For gcc 14 snapshot 20240317, currently I configure with the following
> parameters:
> 
> -enable-shared --enable-threads=posix --with-system-zlib
> --enable-gnu-indirect-function --enable-__cxa_atexit
> --disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch
> --disable-libssp --enable-gnu-unique-object --enable-linker-build-id
> -enable-lto --enable-plugin --enable-install-libiberty --disable-multilib
> --disable-werror --enable-pie --enable-checking=release
> --enable-libstdcxx-dual-abi --with-default-libstdcxx-abi=new
> --enable-default-pie --enable-default-ssp --enable-bootstrap
> --enable-languages=c,c++,fortran,lto --with-abi=lp64d
> --with-arch=loongarch64 --with-tune=la464 --build=loongarch64-aosc-linux-gnu
> --program-suffix=-14.0.1
> 
> And still get a score of 12.8.
> 
> 
> (In reply to Tianyang Chou from comment #24)
> > (In reply to Chen Chen from comment #0)
> > > We tested Loongarch64 CPU Loongson 3A6000 with "LA664" architecture in 
> > > Linux
> > > operating system AOSC OS 11.4.0 (default gcc version is 13.2.0). And we
> > > found the 548.exchange2_r benchmark from SPEC 2017 INTrate suite suffered
> > > significant regressions from 14% to 28% with various compiling options.
> > > 
> > > The rate-1 results are following:
> > > 
> > > after snapshot 20240317 score 14.3-19.3% lower with parameters "-g -Ofast
> > > -march=native":
> > > 13.2.0:    11.7 (223s) [gcc 13.2.0, system default]
> > > 20240317:  11.0 (237s) [gcc 14 snapshot 20240317]
> > > 20240324:  8.88 (295s) [gcc 14 snapshot 20240324]
> > > 20240430:  9.03 (290s) [gcc 14 snapshot 20240430, 14.1.0-RC]
> > > 14.1.0:    9.43 (278s) [gcc 14.1.0 release]
> > > 
> > > after snapshot 20240317 score 16.5-20.8% lower with parameters "-g -Ofast
> > > -march=native -flto": 
> > > 13.2.0:    12.0 (218s)
> > > 20240317:  10.6 (248s)
> > > 20240324:  8.40 (312s)
> > > 20240430:  8.48 (309s)
> > > 14.1.0:    8.85 (296s)
> > > 
> > > 
> > > after snapshot 20240317 score 18-23.1% lower with parameters "-g -Ofast
> > > -march=la664":       
> > > 13.2.0:    "-march=la664" flag is not supported
> > > 20240317:  11.5 (227s)
> > > 20240324:  8.84 (296s)
> > > 20240430:  9.43 (278s)
> > > 14.1.0:    9.42 (278s)
> > > 
> > > 
> > > after snapshot 20240317 score 20.3-21.2% lower with parameters "-g -Ofast
> > > -march=la664 -flto": 
> > > 13.2.0:    "-march=la664" flag is not supported
> > > 20240317:  11.1 (236s)
> > > 20240324:  8.75 (299s)
> > > 20240430:  8.85 (296s)
> > > 14.1.0:    8.85 (296s)
> > > 
> > > 
> > > after snapshot 20240317 score 26.3-26.6% lower with parameters "-g -Ofast
> > > -march=la464":       
> > > 13.2.0:    8.76 (299s)
> > > 20240317:  12.8 (205s)
> > > 20240324:  9.39 (279s)
> > > 20240430:  9.43 (278s)
> > > 14.1.0:    9.43 (278s)
> > > 
> > > 
> > > after snapshot 20240317 score 26.6-28% lower with parameters "-g -Ofast
> > > -march=la464 -flto": 
> > > 13.2.0:    8.52 (307s)
> > > 20240317:  12.8 (204s)
> > > 20240324:  9.22 (284s)
> > > 20240430:  9.37 (280s)
> > > 14.1.0:    9.40 (279s)
> > > 
> > > 
> > > The gcc 14 snapshots and gcc 14.1.0 are compiled with the following
> > > parameters: 
> > > 
> > > --enable-shared --enable-threads=posix --with-system-zlib
> > > --enable-gnu-indirect-function --enable-__cxa_atexit
> > > --disable-libunwind-exceptions --enable-clocale=gnu 
> > > --disable-libstdcxx-pch
> > > --disable-libssp --enable-gnu-unique-object --enable-linker-build-id
> > > --enable-lto --enable-plugin --enable-install-libiberty --disable-multilib
> > > --disable-werror --enable-pie --enable-checking=release
> > > --enable-libstdcxx-dual-abi --with-default-libstdcxx-abi=new
> > > --enable-default-pie --enable-default-ssp --enable-bootstrap
> > > --enable-languages=c,c++,fortran,lto --with-abi=lp64d
> > > --with-arch=loongarch64 --with-tune=la664 
> > > --build=loongarch64-aosc-linux-gnu
> > > 
> > > 
> > > The regression may be found on other types of CPUs as well. We did a quick
> > > test on AMD Zen4 CPU R9 7940HS and found similar but smaller regression:
> > > 
> > > The rate-1 results on x86_64 (AMD R9 7940HS) with operating system Debian 
> > > 12:
> > > 
> > > after snapshot 20240317 score 8.6-9.6% lower with parameters "-m64 -g 
> > > -Ofast
> > > -march=znver3":
> > > 12.2.0:    30.1 (87.0s) [gcc 12.2.0, system default]
> > > 13.2.0:    30.6 (85.7s) [gcc 13.2 release]
> > > 20240317:  31.4 (83.3s) [gcc 14 snapshot]
> > > 20240324:  28.7 (91.2s) [gcc 14 snapshot]
> > > 20240430:  28.4 (92.2s) [gcc 14 snapshot, 14.1.0-RC]
> > > 
> > > after snapshot 20240317 score 10% lower with parameters "-m64 -g -Ofast
> > > -march=znver3 -flto":
> > > 12.2.0:    29.0 (90.3s) 
> > > 13.2.0:    30.9 (84.9s) 
> > > 20240317:  32.0 (81.8s) 
> > > 20240324:  28.8 (90.9s) 
> > > 20240430:  28.8 (91.1s)
> > > 
> > > gcc13 and gcc14 are compiled with the following parameters:
> > > 
> > > --enable-shared --enable-threads=posix --with-system-zlib
> > > --enable-gnu-indirect-function --enable-__cxa_atexit
> > > --disable-libunwind-exceptions --enable-clocale=gnu 
> > > --disable-libstdcxx-pch
> > > --disable-libssp --enable-gnu-unique-object --enable-linker-build-id
> > > --enable-lto --enable-plugin --enable-install-libiberty --disable-multilib
> > > --disable-werror --enable-pie --enable-checking=release
> > > --enable-libstdcxx-dual-abi --with-default-libstdcxx-abi=new
> > > --enable-default-pie --enable-default-ssp --enable-bootstrap
> > > --enable-languages=c,c++,fortran,lto  --build=x86_64-linux-gnu
> > > --host=x86_64-linux-gnu --target=x86_64-linux-gnu
> > 
> > Sorry to talk about something unrelated to this bug. I tried running 548 on
> > CPU loongson 3A6000 with the same compiler version and compiler options as
> > you but the score is only 8.5,  so could you please tell me what am I
> > missing? I just can't reproduce your performance result.
> > 
> > The gcc compiler source code is downloaded from the github repo
> > AOSC-Tracking/gcc(13.2.0), configure it with parameters:
> > 
> > "--enable-shared --enable-threads=posix --with-system-zlib
> > --enable-gnu-indirect-function --enable-__cxa_atexit
> > --disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch
> > --disable-libssp --enable-gnu-unique-object --enable-linker-build-id
> > --enable-lto --enable-plugin --enable-install-libiberty --disable-multilib
> > --disable-werror --enable-pie --enable-checking=release
> > --enable-libstdcxx-dual-abi --with-default-libstdcxx-abi=new
> > --enable-default-pie --enable-default-ssp --enable-bootstrap
> > --enable-languages=c,c++,fortran,lto --with-abi=lp64d
> > --with-arch=loongarch64 --with-tune=la664 
> > --build=loongarch64-aosc-linux-gnu"
> 

I mean that a score of 8.5 for GCC 13.2.0 is not what I expected; it should be
11.7, as mentioned in Comment 1:

13.2.0:    11.7 (223s) [gcc 13.2.0, system default]

The problem is now solved for me, though: after I applied the patch (Comment 19)
to GCC 13.2.0, the score increased to 11.7.
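
For reference, the size of the gap between the score I first measured (8.5) and
the expected one (11.7) can be checked with a quick calculation. This is only an
illustrative sketch: the helper name percent_lower is my own and is not part of
SPEC or GCC tooling.

```python
def percent_lower(baseline: float, observed: float) -> float:
    """Return by how many percent `observed` falls below `baseline`."""
    return (baseline - observed) / baseline * 100.0


# Scores taken from this thread: 11.7 expected for GCC 13.2.0, 8.5 measured.
expected_score = 11.7
measured_score = 8.5

gap = percent_lower(expected_score, measured_score)
print(f"measured score is {gap:.1f}% below the expected score")
```

The same helper can be applied to any pair of scores in the tables quoted above,
with the higher score passed as the baseline.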
