git gcc-verify question
Does anyone know what this is about?

$ git gcc-verify
Checking 918fcaf0cbf833063c45805ef893cfa2c9ebc875: OK
Exception ignored in:
Traceback (most recent call last):
  File "/usr/lib/python3.13/site-packages/git/cmd.py", line 563, in __del__
  File "/usr/lib/python3.13/site-packages/git/cmd.py", line 544, in _terminate
  File "/usr/lib64/python3.13/subprocess.py", line 2227, in terminate
ImportError: sys.meta_path is None, Python is likely shutting down

I am on Fedora 41, just updated.

Jerry
Current trunk fails to build gmp
Trunk has failed to build gmp since some time in November last year. The last successful build was on the 30th of October last year.

The issue seems to be a failing configure test. From config.log:

Test compile: long long reliability test 1
configure:6585: /opt/devel/gnu/gcc/Linux/x86_64-pc-linux-gnu/Ubuntu_22.04/gcc-15.0.0-standard/bin/gcc -O2 -pedantic -fomit-frame-pointer -m64 conftest.c >&5
conftest.c: In function 'f':
conftest.c:12:48: error: too many arguments to function 'g'; expected 0, have 6
   12 | for(i=0;i<1;i++){if(e(got,got,9,d[i].n)==0)h();g(i,d[i].src,d[i].n,got,d[i].want,9);if(d[i].n)h();}}
      |^ ~
conftest.c:7:6: note: declared here
    7 | void g(){}
      |      ^
configure:6588: $? = 1
failed program was:
/* The following provokes a segfault in the compiler on powerpc-apple-darwin.
   Extracted from tests/mpn/t-iord_u.c. Causes Apple's gcc 3.3 build 1640 and
   1666 to segfault with e.g., -O2 -mpowerpc64. */

#if defined (__GNUC__) && ! defined (__cplusplus)
typedef unsigned long long t1;typedef t1*t2;
void g(){}
void h(){}
static __inline__ t1 e(t2 rp,t2 up,int n,t1 v0)
{t1 c,x,r;int i;if(v0){c=1;for(i=1;ivoid f(){static const struct{t1 n;t1 src[9];t1 want[9];}d[]={{1,{0},{1}},};t1 got[9];int i;
for(i=0;i<1;i++){if(e(got,got,9,d[i].n)==0)h();g(i,d[i].src,d[i].n,got,d[i].want,9);if(d[i].n)h();}}
#else
int dummy;
#endif

int main () { return 0; }
configure:7072: result: no, long long reliability test 1

Any comments? I can open a PR if necessary.

Rainer
Re: 22% degradation seen in embench:matmult-int
“the interchanged loop might for example no longer vectorize.”

The loops are not vectorized, which is OK because this device has no support for it. I just don’t think a single pass could make the code that much slower.

Loop interchange is supposed to swap the inner loop of the nest with an outer one to improve cache locality, so that the data needed by the next iteration is already in the cache.

The loop that gets interchanged is at line 143 of the benchmark source:
https://github.com/embench/embench-iot/blob/master/src/matmult-int/matmult-int.c#L143

This loop is where most of the time is spent. It would have been good to have access to h/w tracing, both to see whether the interchanged loop reduces cache misses and to see what is causing it to run this much slower.

Thanks for your reply!

From: Richard Biener
Date: Thursday, February 13, 2025 at 2:57 AM
To: Visda Vokhshoori - C51841
Cc: gcc@gcc.gnu.org
Subject: Re: 22% degradation seen in embench:matmult-int

On Wed, Feb 12, 2025 at 4:38 PM Visda.Vokhshoori--- via Gcc wrote:
>
> Embench is used for benchmarking on embedded devices.
> This one project matmult-int has a function Multiply. It’s a matrix
> multiplication for 20 x 20 matrix.
> The device is a ATSAME70Q21B which is Cortex-M7
> The compiler is arm branch based on GCC version 13
> We are compiling with O3 which has loop-interchange pass on by default.
>
> When we compile with -fno-loop-interchange we get all 22% back plus 5% speed
> up.
>
> When we do the loop interchange on the one loop nest that get interchanged it
> is slightly (.7%) faster.
>
> Has anyone else seen large degradation as a result of loop interchange?

I would suggest comparing the -fopt-info diagnostic output with and without -fno-loop-interchange; the interchanged loop might, for example, no longer vectorize. Other than that - no, loop interchange isn't applied very often and it has a very conservative cost model.

Are you able to share a testcase?

Richard.

>
> Thanks
gcc-12-20250213 is now available
Snapshot gcc-12-20250213 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/12-20250213/
and on various mirrors, see https://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 12 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-12 revision c72f9c0a3ad8eefd0706957ba054c2c2f388d3d5

You'll find:

 gcc-12-20250213.tar.xz               Complete GCC

  SHA256=40a960056dada322b74c706ef762b2cecfdf168120b29862a5271190c21e8354
  SHA1=535427df282b985fe63846c911e798259655dd35

Diffs from 12-20250206 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-12
link is updated and a message is sent to the gcc list. Please do not use
a snapshot before it has been announced that way.
Sourceware Open Valentine Office Friday, Feb 14, 16:00 UTC
Friday Feb 14, 16:00 UTC
At #overseers on irc.libera.chat

To get the right time in your local timezone:
$ date -d "Fri Feb 14 16:00 UTC 2025"

Valentine's day. Let's show our shared infrastructure some love!

- Got issues with the new process/service isolation and/or the
  DDoS protections? Please let us know!
- What is the status of the Forge experiment?
- Need help setting up secure development policies? Just ask!
- Patchwork workflow? Updating CI jobs, autoregen scripts?
  Documentation snapshots? Let's hack together!

Sourceware relies on cooperation among a broad diversity of core toolchain and developer tool projects, hackers, organizations, ideas, and communication styles. The monthly Sourceware Open Office meetings are one way of coming together as a community to discuss our shared development infrastructure.

For other ways to participate see
https://sourceware.org/mission.html#organization
Re: Current trunk fails to build gmp
Rainer Emrich writes:

> Since some time in November last year trunk fails to build gmp.
> Last successful build was 30th of October last year.
>
> The issue seems to be a failing configure test. From config.log:

Please see https://gmplib.org/list-archives/gmp-bugs/2024-November/005550.html.
Building GMP with -std=gnu17 is a workaround.
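For readers following along, here is a rough sketch of why -std=gnu17 helps. GCC 15 defaults to -std=gnu23, and in C23 an empty parameter list such as `void g(){}` means the same as `void g(void)`, which matches the "expected 0, have 6" error in the log. Before C23 the same definition left the parameters unspecified, so the six-argument call was accepted. The names `g_checked`, `g_calls` and `call_g` below are hypothetical, and this is not the change made in GMP; it only shows a full prototype that compiles under both standards.

```c
typedef unsigned long long t1;

/* Hypothetical counter, only here so the call has an observable effect. */
static int g_calls;

/* Full prototype: accepted by C17 and C23 alike.  The configure test's
   `void g(){}` takes no arguments under C23, so its six-argument call
   is rejected there. */
void g_checked (int i, const t1 *src, t1 n, const t1 *got,
                const t1 *want, int size)
{
  (void) i; (void) src; (void) n;
  (void) got; (void) want; (void) size;
  g_calls++;
}

/* Mirrors the shape of the call in conftest.c and returns the number
   of calls made so far. */
int call_g (void)
{
  static const t1 src[9] = {0};
  static const t1 want[9] = {1};
  t1 got[9] = {0};

  g_checked (0, src, 1, got, want, 9);
  return g_calls;
}
```

Compiling this with either -std=gnu17 or -std=gnu23 should work, whereas the original `void g(){}` form only builds with the older setting.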
Re: 22% degradation seen in embench:matmult-int
On Thu, Feb 13, 2025 at 9:30 PM wrote:
>
> “the interchanged loop might for example no longer vectorize.”
>
> The loops are not vectorized. Which is ok, because this device doesn’t have
> the support for it.
>
> I just don’t think a pass could single handedly make code slower that much.
>
> Loop interchange is supposed to interchange the loop nest index with outer
> index to improve cache locality. This is supposed to help -that is the next
> iteration we will have the data available in cache.
>
> The benchmark source –and the loop that gets interchanged is line 143
>
> Source:
> https://github.com/embench/embench-iot/blob/master/src/matmult-int/matmult-int.c#L143

Looks like the classical matmul loop, similar to the one in SPEC CPU bwaves. We do apply interchange here and that looks reasonable to me. Note interchange assumes a CPU uarch with caches and HW prefetching where linear accesses are a lot more efficient than strided ones - that might not hold at all for the Cortex-M7. Without interchange the store to Res[] can be moved out of the inner loop.

I've tried

#define UPPERLIMIT 20
typedef long matrix[UPPERLIMIT][UPPERLIMIT];

void
Multiply (matrix A, matrix B, long * __restrict Res)
{
  register int Outer, Inner, Index;

  for (Outer = 0; Outer < UPPERLIMIT; Outer++)
    for (Inner = 0; Inner < UPPERLIMIT; Inner++)
      {
        (*(matrix *)Res)[Outer][Inner] = 0;

        for (Index = 0; Index < UPPERLIMIT; Index++)
          (*(matrix *)Res)[Outer][Inner] += A[Outer][Index] * B[Index][Inner];
      }
}

and this is interchanged on x86_64 as well. We are implementing a trick for the zeroing which, when moved into innermost position is done as

  for (Index = 0; Index < UPPERLIMIT; Index++)
    for (Inner = 0; Inner < UPPERLIMIT; Inner++)
      {
        tem = Index == 0 ? 0 : (*(matrix *)Res)[Outer][Inner];
        tem += A[Outer][Index] * B[Index][Inner];
        (*(matrix *)Res)[Outer][Inner] = tem;
      }

this conditional might kill performance for you. The advantage is that this loop can now be more efficiently vectorized.
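To make the trade-off above concrete, here is a hand-written sketch of the two loop orders. It is not GCC's actual output, and how closely the generated code resembles it is an assumption. `multiply_original` keeps Index innermost, so the sum can live in a scalar and Res is stored once per element. `multiply_interchanged` moves Inner innermost, so B and Res are walked linearly, at the price of a load, a conditional and a store to Res in every iteration. The function names and the helper `count_mismatches` are hypothetical.

```c
#define UPPERLIMIT 20
typedef long matrix[UPPERLIMIT][UPPERLIMIT];

/* Original nest order: the store to Res is hoisted out of the Index loop. */
void multiply_original (matrix A, matrix B, matrix Res)
{
  for (int Outer = 0; Outer < UPPERLIMIT; Outer++)
    for (int Inner = 0; Inner < UPPERLIMIT; Inner++)
      {
        long sum = 0;
        for (int Index = 0; Index < UPPERLIMIT; Index++)
          sum += A[Outer][Index] * B[Index][Inner];
        Res[Outer][Inner] = sum;
      }
}

/* Interchanged order with the conditional zeroing: linear accesses to B
   and Res, but Res is read and written on every inner iteration. */
void multiply_interchanged (matrix A, matrix B, matrix Res)
{
  for (int Outer = 0; Outer < UPPERLIMIT; Outer++)
    for (int Index = 0; Index < UPPERLIMIT; Index++)
      for (int Inner = 0; Inner < UPPERLIMIT; Inner++)
        {
          long tem = Index == 0 ? 0 : Res[Outer][Inner];
          tem += A[Outer][Index] * B[Index][Inner];
          Res[Outer][Inner] = tem;
        }
}

/* Fills A and B with small deterministic values, runs both variants and
   returns how many result elements differ. */
int count_mismatches (void)
{
  static matrix A, B, R1, R2;
  int mismatches = 0;

  for (int i = 0; i < UPPERLIMIT; i++)
    for (int j = 0; j < UPPERLIMIT; j++)
      {
        A[i][j] = i + j;
        B[i][j] = i - j + 1;
        R1[i][j] = 12345;   /* stale values that both variants must overwrite */
        R2[i][j] = 12345;
      }

  multiply_original (A, B, R1);
  multiply_interchanged (A, B, R2);

  for (int i = 0; i < UPPERLIMIT; i++)
    for (int j = 0; j < UPPERLIMIT; j++)
      if (R1[i][j] != R2[i][j])
        mismatches++;

  return mismatches;
}
```

Both variants compute the same product, so timing them separately on the target (with interchange disabled for this file) would be one way to check whether the extra Res traffic and the conditional account for the slowdown.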