Re: Rant about ChangeLog entries and commit messages
On Sun, 2007-12-02 at 21:36 +0100, Eric Botcazou wrote:
> > I'd go even further, and say if the GNU coding standards say we
> > shouldn't be putting descriptions of why we are changing things in the
> > ChangeLog, then they should be changed and should be ignored on this
> > point until they do. Pointing to them as if they are The One True
> > Way seems very suspect to me. After all, how else would they ever
> > improve if nobody tries anything different?
>
> The people who wrote them presumably thought about these issues, too.

Unfortunately they didn't document the "why", just the "what"!

Tim Josling
Re: Rant about ChangeLog entries and commit messages
On Sun, 2007-12-02 at 09:26 -0500, Robert Kiesling wrote:
> > I guess nobody really loves writing ChangeLog entries, but in my opinion
> > they are quite effective "executive summaries" for the patches and
> > helpful to the reader/reviewer. Please let's not throw the baby out
> > with the bathwater.
>
> If there's a mechanism to filter checkin messages to ChangeLog summaries,
> I would be happy to use it - in cases of multiple packages, especially,
> it's important to know what changes were made, when, and when the changes
> propagated through packages and releases, and where they got to,
> occasionally. Anybody know of a useful, built-in mechanism for this task?

Personally I find it slow and inefficient tracing through why a given change was made. It is just a slow process searching, and sometimes I don't bother because it is so inconvenient. The ChangeLog entries provide little help, and there does not seem to be a good alternative. If there is a good alternative, no-one has said what it is so far.

As people have pointed out, the RCSs pretty well cover the "what" these days. And writing ChangeLog entries, which largely duplicate this information, is time-consuming and tedious. And they are of little to no value, to me at least.

The coding standards do allow, in some cases, that giving some context would be useful:

> See also what the GNU Coding Standards have to say about what goes in
> ChangeLogs; in particular, descriptions of the purpose of code and
> changes should go in comments rather than the ChangeLog, though a
> single line overall description of the changes may be useful above the
> ChangeLog entry for a large batch of changes.

I personally would strongly favour each ChangeLog entry having a single line of context. This could be the PR number, or a single line giving the purpose of the change or what bigger change it is part of.
As pointed out by Zack Weinberg in his paper "A Maintenance Programmer's View of GCC", there are many impediments to contributing to GCC. http://www.linux.org.uk/~ajh/gcc/gccsummit-2003-proceedings.pdf Things are not much better than they were when Zack wrote his paper. This small change would be one positive step in the right direction, IMHO. Tim Josling
Re: Build failure in dwarf2out
Paul Thomas wrote: I am being hit by this:

rf2out.c -o dwarf2out.o
../../trunk/gcc/dwarf2out.c: In function `file_name_acquire':
../../trunk/gcc/dwarf2out.c:7672: error: `files' undeclared (first use in this function)
../../trunk/gcc/dwarf2out.c:7672: error: (Each undeclared identifier is reported only once
../../trunk/gcc/dwarf2out.c:7672: error: for each function it appears in.)
../../trunk/gcc/dwarf2out.c:7672: error: `i' undeclared (first use in this function)

My guess is that the #define activating that region of code is erroneously triggered. I am running the 2-day (on cygwin with a substandard BIOS) testsuite now.
Re: Call to arms: testsuite failures on various targets
FX Coudert wrote: Hi all, I reviewed this afternoon the postings from the gcc-testresults mailing-list for the past month, and we have a couple of gfortran testsuite failures showing up on various targets. Could people with access to said targets (possibly maintainers) please file PRs in bugzilla for each testcase, reporting the error message and/or backtrace? (I'd be happy to be added to the Cc list of these)

* ia64-suse-linux-gnu: gfortran.dg/vect/vect-4.f90
FAIL: gfortran.dg/vect/vect-4.f90 -O scan-tree-dump-times Alignment of access forced using peeling 1
FAIL: gfortran.dg/vect/vect-4.f90 -O scan-tree-dump-times Vectorizing an unaligned access 1

This happens on all reported ia64 targets, including mine. What is expected here? There is no vectorization on ia64, no reason for peeling. The compilation has no problem, and there is no report generated. As far as I know, the vectorization options are ignored. Without unrolling, of course, gfortran doesn't optimize the loop at all, but I assume that's a different question.
Re: Call to arms: testsuite failures on various targets
-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-iv-4.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-iv-9.c scan-tree-dump-times vectorized 1 loops 2
FAIL: gcc.dg/vect/vect-reduc-dot-s16b.c scan-tree-dump-times vectorized 1 loops 0
FAIL: gcc.dg/vect/vect-reduc-dot-u16b.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/vect-reduc-pattern-1a.c scan-tree-dump-times vectorized 1 loops 0
FAIL: gcc.dg/vect/vect-reduc-pattern-1c.c scan-tree-dump-times vectorized 1 loops 0
FAIL: gcc.dg/vect/vect-reduc-pattern-2a.c scan-tree-dump-times vectorized 1 loops 0
FAIL: gcc.dg/vect/vect-widen-mult-u16.c scan-tree-dump-times vectorized 1 loops 1
FAIL: gcc.dg/vect/wrapv-vect-reduc-pattern-2c.c scan-tree-dump-times vectorized 1 loops 0
FAIL: gcc.dg/vect/no-section-anchors-vect-69.c scan-tree-dump-times Alignment of access forced using peeling 3

=== gcc Summary ===
# of expected passes 42231
# of unexpected failures 23
# of unexpected successes 2
# of expected failures 155
# of unresolved testcases 2
# of untested testcases 28
# of unsupported tests 374
/home/tim/src/gcc-4.3-20070413/ia64/gcc/xgcc version 4.3.0 20070413 (experimental)

=== gfortran tests ===
Running target unix

=== gfortran Summary ===
# of expected passes 17438
# of expected failures 13
# of unsupported tests 20
/home/tim/src/gcc-4.3-20070413/ia64/gcc/testsuite/gfortran/../../gfortran version 4.3.0 20070413 (experimental)

=== g++ tests ===
Running target unix
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call -> direct call.* AA transformation on insn
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call -> direct call.* AA transformation on insn
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call -> direct call.* AA transformation on insn
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call -> direct call.* AA transformation on insn
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call ->
direct call.* AA transformation on insn
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call -> direct call.* AA transformation on insn
FAIL: g++.dg/tree-prof/indir-call-prof.C scan-tree-dump Indirect call -> direct call.* AA transformation on insn

=== g++ Summary ===
# of expected passes 13739
# of unexpected failures 7
# of expected failures 79
# of unsupported tests 119
/home/tim/src/gcc-4.3-20070413/ia64/gcc/testsuite/g++/../../g++ version 4.3.0 20070413 (experimental)

=== objc tests ===
Running target unix

=== objc Summary ===
# of expected passes 1810
# of expected failures 7
# of unsupported tests 25
/home/tim/src/gcc-4.3-20070413/ia64/gcc/xgcc version 4.3.0 20070413 (experimental)

=== libgomp tests ===
Running target unix

=== libgomp Summary ===
# of expected passes 1566

=== libstdc++ tests ===
Running target unix
XPASS: 26_numerics/headers/cmath/c99_classification_macros_c.cc (test for excess errors)
XPASS: 27_io/fpos/14320-1.cc execution test

=== libstdc++ Summary ===
# of expected passes 4859
# of unexpected successes 2
# of expected failures 27

Compiler version: 4.3.0 20070413 (experimental)
Platform: ia64-unknown-linux-gnu
configure flags: --enable-languages='c c++ fortran objc' --enable-bootstrap --enable-maintainer-mode --disable-libmudflap --prefix=/usr/local/gcc43
EOF
Mail -s "Results for 4.3.0 20070413 (experimental) testsuite on ia64-unknown-linux-gnu" [EMAIL PROTECTED] &&
Re: Where is gstdint.h
[EMAIL PROTECTED] wrote: Where is gstdint.h? Does it actually exist? libdecnumber seems to use it. decimal32|64|128.h's include decNumber.h, which includes deccontext.h, which includes gstdint.h. When you configure libdecnumber (e.g. by running the top-level gcc configure), gstdint.h should be created, by modifying . Since you said nothing about the conditions where you had a problem, you can't expect anyone to fix it for you. If you do want it fixed, you should at least file a complete PR. As it is more likely to happen with a poorly supported target, you may have to look into it in more detail than that. When this happened to me, I simply made a copy of stdint.h to get over the hump.
Re: Where is gstdint.h
[EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: Where is gstdint.h? Does it actually exist? libdecnumber seems to use it. decimal32|64|128.h's include decNumber.h, which includes deccontext.h, which includes gstdint.h. When you configure libdecnumber (e.g. by running the top-level gcc configure), gstdint.h should be created, by modifying . Since you said nothing about the conditions where you had a problem, you can't expect anyone to fix it for you. If you do want it fixed, you should at least file a complete PR. As it is more likely to happen with a poorly supported target, you may have to look into it in more detail than that. When this happened to me, I simply made a copy of stdint.h to get over the hump. Thanks for the prompt reply. I am doing a 386 build. I could not find it in my build directory, but it is there after all. Sorry, not used to finding files in Linux. Aaron You can't expect people to guess which 386 build you are doing. Certain 386 builds clearly are not in the "poorly supported" category; others may be.
Re: Where is gstdint.h
[EMAIL PROTECTED] wrote: Tim Prince wrote: [EMAIL PROTECTED] wrote: Where is gstdint.h? Does it actually exist? libdecnumber seems to use it. decimal32|64|128.h's include decNumber.h, which includes deccontext.h, which includes gstdint.h. When you configure libdecnumber (e.g. by running the top-level gcc configure), gstdint.h should be created, by modifying . Since you said nothing about the conditions where you had a problem, you can't expect anyone to fix it for you. If you do want it fixed, you should at least file a complete PR. As it is more likely to happen with a poorly supported target, you may have to look into it in more detail than that. When this happened to me, I simply made a copy of stdint.h to get over the hump. This might happen when you run the top level gcc configure in its own directory. You may want to try to make a new directory elsewhere and run configure there:

  pwd
  .../my-gcc-source-tree
  mkdir ../build
  cd ../build
  ../my-gcc-source-tree/configure
  make

If you're suggesting trying to build in the top level directory to see if the same problem occurs, I would expect other problems to arise. If it would help diagnose the problem, and the problem persists for a few weeks, I'd be willing to try it.
Re: Effects of newly introduced -mpcX 80387 precision flag
[EMAIL PROTECTED] wrote: I just (re-)discovered these tables giving maximum known errors in some libm functions when extended precision is enabled: http://people.inf.ethz.ch/gonnet/FPAccuracy/linux/summary.html and when the precision of the mantissa is set to 53 bits (double precision): http://people.inf.ethz.ch/gonnet/FPAccuracy/linux64/summary.html This is from 2002, and indeed, some of the errors in double-precision results are hundreds or thousands of times bigger when the precision is set to 53 bits. This isn't very helpful. I can't find an indication of whose libm is being tested, it appears to be an unspecified non-standard version of gcc, and a lot of digging would be needed to find out what the tests are. It makes no sense at all for sqrt() to break down with change in precision mode. Extended precision typically gives a significant improvement in accuracy of complex math functions, as shown in the Celefunt suite from TOMS. The functions shown, if properly coded for SSE2, should be capable of giving good results, independent of x87 precision mode. I understand there is continuing academic research. Arguments have been going on for some time on whether to accept approximate SSE2 math libraries. I personally would not like to see new libraries without some requirement for readable C source and testing. I agree that it would be bad to set 53-bit mode blindly for a library which expects 64-bit mode, but it seems a serious weakness if such a library doesn't take care of precision mode itself. The whole precision mode issue seems somewhat moot, now that years have passed since the last CPUs were made which do not support SSE2, or the equivalent in other CPU families.
Re: Effects of newly introduced -mpcX 80387 precision flag
[EMAIL PROTECTED] wrote: On Apr 29, 2007, at 1:01 PM, Tim Prince wrote: It makes no sense at all for sqrt() to break down with change in precision mode. If you do an extended-precision (80-bit) sqrt and then round the result again to a double (64-bit) then those two roundings will increase the error, sometimes to > 1/2 ulp. To give current results on a machine I have access to, I ran the tests there on vendor_id : AuthenticAMD cpu family : 15 model : 33 model name : Dual Core AMD Opteron(tm) Processor 875 using euler-59% gcc -v Using built-in specs. Target: x86_64-unknown-linux-gnu Configured with: ../configure --prefix=/pkgs/gcc-4.1.2 Thread model: posix gcc version 4.1.2 on an up-to-date RHEL 4.0 server (so whatever libm is offered there), and, indeed, the only differences that it found were in 1/x, sqrt(x), and Pi*x because of double rounding. In other words, the code that went through libm gave identical answers whether running on sse, x87 (extended precision), or x87 (double precision). I don't know whether there are still math libraries for which Gonnet's 2002 results prevail. Double rounding ought to be avoided by -mfpmath=sse and permitting builtin_sqrt to do its thing, or by setting 53-bit precision. The latter disables long double. The original URL showed total failure of sqrt(); double rounding only brings error of .5 ULP, as usually assessed. I don't think the 64-/53-bit double rounding of sqrt can be detected, but of course such double rounding of * can be measured. With Pi, you have various possibilities, according to precision of the Pi value (including the possibility of the one supplied by the x87 instruction) as well as the 2 choices of arithmetic precision mode.
Re: Successful Build of gcc on Cygwin WinXp SP2
[EMAIL PROTECTED] wrote: Cygcheck version 1.90 Compiled on Jan 31 2007 How do I get a later version of Cygwin? 1.90 is the current release version. It seems unlikely that later trial versions have a patch for the stdio.h conflict with C99, or changed headers to avoid warnings which by default are fatal. If you want a newer cygwin.dll, read the cygwin mail list archive for hints, but it doesn't appear to be relevant.
Re: Successful Build of gcc on Cygwin WinXp SP2
[EMAIL PROTECTED] wrote: James, On 5/1/07, Aaron Gray <[EMAIL PROTECTED]> wrote: Hi James,

> Successfully built latest gcc on Win XP SP2 with cvs built cygwin.

I was wondering whether you could help to get me to the same point please. You will need to use Dave Korn's patch for newlib. http://sourceware.org/ml/newlib/2007/msg00292.html I am getting the following:-

$ patch newlib/libc/include/stdio.h fix-gcc-bootstrap-on-cygwin-patch.diff
patching file newlib/libc/include/stdio.h
Hunk #1 succeeded at 475 (offset 78 lines).
Hunk #2 FAILED at 501.
Hunk #3 FAILED at 521.
2 out of 3 hunks FAILED -- saving rejects to file newlib/libc/include/stdio.h.rej

I had to apply the relevant changes manually to the cygwin . It doesn't appear to match the version for which Dave made the patch.
Re: What happened to bootstrap-lean?
Gabriel Dos Reis wrote: Andrew Pinski <[EMAIL PROTECTED]> writes: | > | > On Fri, 16 Dec 2005, Paolo Bonzini wrote: | > > Yes. "make bubblestrap" is now called simply "make". | > | > Okay, how is "make bootstrap-lean" called these days? ;-) | > | > In fact, bootstrap-lean is still documented in install.texi and | > makefile.texi, but it no longer seems to be present in the Makefile | > machinery. Could we get this back? | | bootstrap-lean is done by doing the following (which I feel is the wrong way): | Configure with --enable-bootstrap=lean | and then do a "make bootstrap" Hmm, does that mean that I would have to reconfigure GCC if I wanted to do "make bootstrap-lean" after a previous configuration and build? I think the answer must be "no", but I'm not sure. -- Gaby I've not been able to find another way to rebuild (on SuSE 9.2, for example) after applying the weekly patch file. I'm hoping that suggestion works.
Re: Fwd: Windows support dropped from gcc trunk
On 10/14/2015 11:36 AM, Steve Kargl wrote:
> On Wed, Oct 14, 2015 at 11:32:52AM -0400, Tim Prince wrote:
>> Sorry if someone sees this multiple times; I think it may have been
>> stopped by ISP or text mode filtering:
>>
>> Since Sept. 26, the partial support for Windows 64-bit has been dropped
>> from gcc trunk:
>> winnt.c apparently has problems with seh, which prevent bootstrapping,
>> and prevent the new gcc from building libraries.
>> libgfortran build throws a fatal error on account of lack of support for
>> __float128, even if a working gcc is used.
>> I didn't see any notification about this; maybe it wasn't a consensus
>> decision?
>> There are satisfactory pre-built gfortran 5.2 compilers (including
>> libgomp, although that is off by default and the testsuite wants acc as
>> well as OpenMP) available in cygwin64 (test version) and (apparently)
>> mingw-64.
>>
> The last comment to winnt.c is
>
> 2015-10-02  Kai Tietz
>
> PR target/51726
> * config/i386/winnt.c (ix86_handle_selectany_attribute): Handle
> selectany within this function without need to keep attribute.
> (i386_pe_encode_section_info): Remove selectany-code.
>
> Perhaps, contact Kai.
>
> I added gcc@gcc.gnu.org as this technically isn't a Fortran issue.

The test suite reports hundreds of new ICE instances, all referring to this seh_unwind_emit function:

/cygdrive/c/users/tim/tim/tim/src/gnu/gcc1/gcc/testsuite/gcc.c-torture/compile/2127-1.c: In function 'foo':
/cygdrive/c/users/tim/tim/tim/src/gnu/gcc1/gcc/testsuite/gcc.c-torture/compile/2127-1.c:7:1: internal compiler error: in i386_pe_seh_unwind_emit, at config/i386/winnt.c:1137
Please submit a full bug report,

I will file a bugzilla if that is what is wanted, but I wanted to know if there is a new configure option required. As far as I know there were always problems with long double for Windows targets, but the refusal of libgfortran to build on account of it is new. Thanks, Tim
New CA mirror
Hey, We have added a new mirror in Canada. IP address is being geolocated in the US but it is actually Canadian. If it has to be listed as a US mirror please let me know. Could you please add it to the list?
---
Canada, Quebec: http://ca.mirror.babylon.network/gcc/ | ftp://ca.mirror.babylon.network/gcc/ | rsync://ca.mirror.babylon.network/gcc/, thanks to Tim Semeijn (noc@babylon.network) at Babylon Network.
---
Thanks in advance!
-- Tim Semeijn Babylon Network PGP: 0x2A540FA5 / 3DF3 13FA 4B60 E48A E755 9663 B187 0310 2A54 0FA5
Maintenance ca.mirror.babylon.network
Dear, The storage of ca.mirror.babylon.network is not functioning properly and we will rebuild the storage platform in the following days. The mirror might become temporarily unavailable during this process. Once the storage has been rebuilt I will inform you straight away. I hope to have informed you sufficiently, and if you have any questions please let me know. Best regards,
-- Tim Semeijn Babylon Network PGP: 0x2A540FA5 / 3DF3 13FA 4B60 E48A E755 9663 B187 0310 2A54 0FA5
Re: question about -ffast-math implementation
On 6/2/2014 3:00 AM, Andrew Pinski wrote: On Sun, Jun 1, 2014 at 11:09 PM, Janne Blomqvist wrote: On Sun, Jun 1, 2014 at 9:52 AM, Mike Izbicki wrote: I'm trying to copy gcc's behavior with the -ffast-math compiler flag into haskell's ghc compiler. The only documentation I can find about it is at: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html I understand how floating point operations work and have come up with a reasonable list of optimizations to perform. But I doubt it is exhaustive. My question is: where can I find all the gory details about what gcc will do with this flag? I'm perfectly willing to look at source code if that's what it takes. In addition to the official documentation, a nice overview is at https://gcc.gnu.org/wiki/FloatingPointMath Useful, thanks for the pointer Though for the gory details and authoritative answers I suppose you'd have to look into the source code. Also, are there any optimizations that you wish -ffast-math could perform, but for various architectural reasons they don't fit into gcc? There are of course a (nearly endless?) list of optimizations that could be done but aren't (lack of manpower, impractical, whatnot). I'm not sure there are any interesting optimizations that would be dependent on loosening -ffast-math further? I find it difficult to remember how to reconcile differing treatments by gcc and gfortran under -ffast-math; in particular, with respect to -fprotect-parens and -freciprocal-math. The latter appears to comply with Fortran standard. (One thing I wish wouldn't be included in -ffast-math is -fcx-limited-range; the naive complex division algorithm can easily lead to comically poor results.) Which is kinda interesting because the Google folks have been trying to turn on -fcx-limited-range for C++ a few times now. Intel tried to add -complex-limited-range as a default under -fp-model fast=1 but that was shown to be unsatisfactory. 
Now, with the introduction of omp simd directives and pragmas, we have disagreement among various compilers on the relative roles of the directives and the fast-math options. I've submitted PR60117 hoping to get some insight on whether omp simd should disable optimizations otherwise performed by -ffast-math. Intel made the directives over-ride the command-line fast (or "no-fast") settings locally, so that complex-limited-range might be in effect inside the scope of the directive (whether or not you want it). They made changes in the current beta compiler, so it's no longer practical to set standard-compliant options but discard them by pragma in individual for loops. -- Tim Prince
New French mirror
Hi, I have set up a French gcc mirror. It is located in Roubaix, France. It is reachable through http, ftp and rsync: http://mirror.bbln.nl/gcc ftp://mirror.bbln.nl/gcc rsync://mirror.bbln.nl/gcc This mirror is provided by BBLN. Could you add it to the mirrorlist? If you have any questions please let me know! Best regards, Tim Semeijn BBLN
Rearrangement mirror servers
Dear GCC Mirror Admin, I have rearranged our mirror setup, which means we offer additional mirror servers mirroring GCC. Could you please make the following changes:

- Remove the current 'mirror.bbln.org' entry as mirror
- Please add the following three mirrors:

Located in Gravelines, France
http://mirror-fr1.bbln.org/gcc
https://mirror-fr1.bbln.org/gcc
ftp://mirror-fr1.bbln.org/gcc
rsync://mirror-fr1.bbln.org/gcc

Located in Roubaix, France
http://mirror-fr2.bbln.org/gcc
https://mirror-fr2.bbln.org/gcc
ftp://mirror-fr2.bbln.org/gcc
rsync://mirror-fr2.bbln.org/gcc

Located in Amsterdam, The Netherlands
http://mirror-nl1.bbln.org/gcc
https://mirror-nl1.bbln.org/gcc
ftp://mirror-nl1.bbln.org/gcc
rsync://mirror-nl1.bbln.org/gcc

As contact for these mirrors you can list: BBLN (n...@bbln.org)

Thanks in advance!
-- Tim Semeijn pgp 0x08CE9B4D
Partial inline on recursive functions?
Hi there, I found the C++11 code below:

  int Fib(int n) {
    if (n <= 1) return n;
    return [&] { return Fib(n-2) + Fib(n-1); }();
  }

is ~2x faster than the normal one:

  int Fib(int n) {
    if (n <= 1) return n;
    return Fib(n-2) + Fib(n-1);
  }

I tested them with "-std=c++11 -O3/-O2" using trunk, and the first version is ~2x (1.618x in theory?) faster. However, the first version has a larger binary size (101k compared to 3k for the second version). Clang produces 4k for the first version (with a similar speed improvement) though.

My guess is that the first `if (n <= 1) return n;` is easier to inline into the caller side, since the returned expression is a call to a separate function. It's translated to something like (ignoring linkage differences):

  int foo(int n);
  int Fib(int n) {
    if (n <= 1) return n;
    return foo(n);
  }
  int foo(int n) { return Fib(n-2) + Fib(n-1); }

After inline optimizations, it's translated to:

  int foo(int n);
  int Fib(int n) {
    if (n <= 1) return n;
    return foo(n);
  }
  int foo(int n) {
    return ((n-2 <= 1) ? n-2 : foo(n-2)) + ((n-1 <= 1) ? n-1 : foo(n-1));
  }

As a result, the maximum depth of the stack is reduced by 1, since all boundary checks (if (n <= 1) return n;) are done on the caller side, which may eliminate unnecessary function call overhead.

To me the optimization should be: for a given recursive function A, split it into functions B and C, so that A is equivalent to { B(); return C(); }, where B should be easy to inline (e.g. no recursive calls) and C may not be.

Is it possible/reasonable to do such an optimization? I hope it can help. :) Thanks! -- Regards, Tim Shen
Mirror Changes
Hey, We have changed our company name, hostnames and contact information. Please remove the current BBLN mirror (mirror.bbln.org) and replace it with our three new ones:

---
http://mirror0.babylon.network/gcc/
https://mirror0.babylon.network/gcc/
ftp://mirror0.babylon.network/gcc/
rsync://mirror0.babylon.network/gcc/
Location: Gravelines, France
Contact: Tim Semeijn (noc@babylon.network) at Babylon Network
---
http://mirror1.babylon.network/gcc/
https://mirror1.babylon.network/gcc/
ftp://mirror1.babylon.network/gcc/
rsync://mirror1.babylon.network/gcc/
Location: Roubaix, France
Contact: Tim Semeijn (noc@babylon.network) at Babylon Network
---
http://mirror2.babylon.network/gcc/
https://mirror2.babylon.network/gcc/
ftp://mirror2.babylon.network/gcc/
rsync://mirror2.babylon.network/gcc/
Location: Amsterdam, The Netherlands
Contact: Tim Semeijn (noc@babylon.network) at Babylon Network
---

I will also send this e-mail from noc@babylon.network to confirm the request. Thanks in advance!
-- Tim Semeijn pgp 0x08CE9B4D
Confirmation Mirror Changes
Hey, [[[ This is a confirmation of the request sent from n...@bbln.org ]]] We have changed our company name, hostnames and contact information. Please remove the current BBLN mirror (mirror.bbln.org) and replace it with our three new ones:

---
http://mirror0.babylon.network/gcc/
https://mirror0.babylon.network/gcc/
ftp://mirror0.babylon.network/gcc/
rsync://mirror0.babylon.network/gcc/
Location: Gravelines, France
Contact: Tim Semeijn (noc@babylon.network) at Babylon Network
---
http://mirror1.babylon.network/gcc/
https://mirror1.babylon.network/gcc/
ftp://mirror1.babylon.network/gcc/
rsync://mirror1.babylon.network/gcc/
Location: Roubaix, France
Contact: Tim Semeijn (noc@babylon.network) at Babylon Network
---
http://mirror2.babylon.network/gcc/
https://mirror2.babylon.network/gcc/
ftp://mirror2.babylon.network/gcc/
rsync://mirror2.babylon.network/gcc/
Location: Amsterdam, The Netherlands
Contact: Tim Semeijn (noc@babylon.network) at Babylon Network
---

Thanks in advance!
-- Tim Semeijn Babylon Network pgp 0x5B8A4DDF
Re: [wwwdocs] PATCH for Re: Confirmation Mirror Changes
Dear Gerald, Thanks for processing the patch! Best regards,

On 4/23/15 11:49 PM, Gerald Pfeifer wrote:
> On Fri, 17 Apr 2015, Tim Semeijn wrote:
>> We have changed our company name, hostnames and contact
>> information. Please remove the current BBLN mirror
>> (mirror.bbln.org) and replace it with our three new ones:
>
> The patch below implements those changes:
>
> - Replace mirror.bbln.org by mirror1.babylon.network.
> - Add mirror0.babylon.network, France, Gravelines.
> - Add mirror2.babylon.network, The Netherlands, Amsterdam.
>
> Applied.
>
> If you have any further changes, suggesting a patch against
> https://gcc.gnu.org/mirrors.html would be great.
>
> Gerald
>
> Index: mirrors.html
> ===
> RCS file: /cvs/gcc/wwwdocs/htdocs/mirrors.html,v
> retrieving revision 1.229
> diff -u -r1.229 mirrors.html
> --- mirrors.html 7 Apr 2015 18:17:46 - 1.229
> +++ mirrors.html 23 Apr 2015 21:39:08 -
> @@ -19,11 +19,16 @@
>  Canada: href="http://gcc.skazkaforyou.com">http://gcc.skazkaforyou.com, thanks to Sergey Ivanov (mirrors at skazkaforyou.com)
>  France (no snapshots): href="ftp://ftp.lip6.fr/pub/gcc/">ftp.lip6.fr, thanks to ftpmaint at lip6.fr
>  France, Brittany: href="ftp://ftp.irisa.fr/pub/mirrors/gcc.gnu.org/gcc/">ftp.irisa.fr, thanks to ftpmaint at irisa.fr
> +France, Gravelines:
> +href="http://mirror0.babylon.network/gcc/">http://mirror0.babylon.network/gcc/ |
> +href="ftp://mirror0.babylon.network/gcc/">ftp://mirror0.babylon.network/gcc/ |
> +href="rsync://mirror0.babylon.network/gcc/">rsync://mirror0.babylon.network/gcc/,
> +thanks to Tim Semeijn (noc@babylon.network) at Babylon Network.
>  France, Roubaix:
> -href="http://mirror.bbln.org/gcc/">http://mirror.bbln.org/gcc/ |
> -href="ftp://mirror.bbln.org/gcc">ftp://mirror.bbln.org/gcc |
> -href="rsync://mirror.bbln.org/gcc">rsync://mirror.bbln.org/gcc,
> -thanks to Tim Semeijn (n...@bbln.org) and BBLN.
> +href="http://mirror1.babylon.network/gcc/">http://mirror1.babylon.network/gcc/ |
> +href="ftp://mirror1.babylon.network/gcc/">ftp://mirror1.babylon.network/gcc/ |
> +href="rsync://mirror1.babylon.network/gcc/">rsync://mirror1.babylon.network/gcc/,
> +thanks to Tim Semeijn (noc@babylon.network) at Babylon Network.
>  France, Versailles: href="ftp://ftp.uvsq.fr/pub/gcc/">ftp.uvsq.fr, thanks to ftpmaint at uvsq.fr
>  Germany, Berlin: href="ftp://ftp.fu-berlin.de/unix/languages/gcc/">ftp.fu-berlin.de, thanks to ftp at fu-berlin.de
>  Germany: href="ftp://ftp.gwdg.de/pub/misc/gcc/">ftp.gwdg.de, thanks to emoenke at gwdg.de
> @@ -34,6 +39,11 @@
>  Japan: href="ftp://ftp.dti.ad.jp/pub/lang/gcc/">ftp.dti.ad.jp, thanks to IWAIZAKO Takahiro (ftp-admin at dti.ad.jp)
>  Japan: href="http://ftp.tsukuba.wide.ad.jp/software/gcc/">ftp.tsukuba.wide.ad.jp, thanks to Kohei Takahashi (tsukuba-ftp-servers at tsukuba.wide.ad.jp)
>  Latvia, Riga: href="http://mirrors.webhostinggeeks.com/gcc/">mirrors.webhostinggeeks.com/gcc/, thanks to Igor (whg.igp at gmail.com)
> +The Netherlands, Amsterdam:
> +href="http://mirror2.babylon.network/gcc/">http://mirror2.babylon.network/gcc/ |
> +href="ftp://mirror2.babylon.network/gcc/">ftp://mirror2.babylon.network/gcc/ |
> +href="rsync://mirror2.babylon.network/gcc/">rsync://mirror2.babylon.network/gcc/,
> +thanks to Tim Semeijn (noc@babylon.network) at Babylon Network.
>  The Netherlands, Nijmegen: href="ftp://ftp.nluug.nl/mirror/languages/gcc">ftp.nluug.nl, thanks to Jan Cristiaan van Winkel (jc at ATComputing.nl)
>  Russia: href="http://mirrors-ru.go-parts.com/gcc/">http://mirrors-ru.go-parts.com/gcc

-- Tim Semeijn Babylon Network pgp 0x5B8A4DDF
add command line option to gcc
I have a use case where I would like gcc to accept -Kthread and act as if it was passed -pthread. So -Kthread would be a synonym for -pthread. I am having trouble figuring out how the option processing is handled. Possibly in gcc/gcc.c, but I am stumped here. Any pointers would be welcome. Thanks.

--
Tim Rice
Multitalents
t...@multitalents.net
Re: add command line option to gcc
On Fri, 6 Sep 2019, Jonathan Wakely wrote:
> On Fri, 6 Sep 2019 at 04:26, Tim Rice wrote:
> >
> > I have a use case where I would like gcc to accept -Kthread
> > and act as if it was passed -pthread. So -Kthread would
> > be a synonym for -pthread.
>
> For a specific target, or universally?

Likely only useful for UnixWare (and OpenServer 6).

> > I am having trouble figuring out how the option processing is handled.
> > Possibly in gcc/gcc.c but I am stumped here.
>
> You could use "specs" to tell the driver to use -pthread when -Kthread
> is given e.g.
>
> %{Kthread: -pthread}
>
> This can either be hardcoded into the 'gcc' driver program (which
> would be done in gcc/gcc.c or in a per-target file under gcc/config)
> or provided in a specs file with the -specs option (see the manual).

Ok, I'll go down this path and see how it works out. Thanks.

> The quick and dirty way to test that would be to dump the current
> specs to a file with 'gcc -dumpspecs > kthread.spec' and then edit the
> file so that everywhere you see %{pthread: xxx} you add %{Kthread: xxx}
> to make it do the same thing. Then you can run
> gcc -specs=kthread.spec -Kthread ...

--
Tim Rice
Multitalents
(707) 456-1146
t...@multitalents.net
Remove ca.mirror.babylon.network
We will soon decommission our Canadian mirror due to restructuring. Please remove the following server from the mirror list: ca.mirror.babylon.network/gcc Our French mirrors will remain active. Thanks! -- Tim Semeijn Babylon Network PGP: 0x2A540FA5 / 3DF3 13FA 4B60 E48A E755 9663 B187 0310 2A54 0FA5
Remove *.mirror.babylon.network
Dear, For the foreseeable future we will not be able to provide our mirrors anymore. Could you please remove: nl.mirror.babylon.network fr.mirror.babylon.network Thanks! -- Tim Semeijn Babylon Network PGP: 0x2A540FA5 / 3DF3 13FA 4B60 E48A E755 9663 B187 0310 2A54 0FA5
Re: Vector permutation only deals with # of vector elements same as mask?
On 2/11/2011 7:30 AM, Bingfeng Mei wrote:

Thanks. Another question. Is there any plan to vectorize loops like the following ones?

    for (i = 127; i >= 0; i--) {
        x[i] = y[i] + z[i];
    }

When I last tried, the Sun compilers could vectorize such loops efficiently (for fairly short loops), with appropriate data definitions. The Sun compilers didn't peel for alignment, to improve performance on longer loops, as gcc and others do. For a case with no data overlaps (float * __restrict__ x, y, z, or Fortran), loop reversal can do the job. gcc has some loop reversal machinery, but I haven't seen it used for vectorization. In a simple case like this, some might argue there's no reason to write a backward loop when it could easily be reversed in source code, and compilers have been seen to make mistakes in reversal.

-- Tim Prince
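To make the no-overlap case concrete, here is a hedged sketch (function and variable names are mine, not from the thread): with restrict-qualified pointers the backward loop carries no dependence between iterations, so a compiler is free to reverse its direction and vectorize it.

```c
/* Backward loop over non-overlapping arrays: every iteration is
   independent, so traversal order is immaterial and the loop is
   legal to reverse (and thus to vectorize). */
void add_rev(float *restrict x, const float *restrict y,
             const float *restrict z, int n)
{
    for (int i = n - 1; i >= 0; i--)
        x[i] = y[i] + z[i];
}
```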
Re: numerical results differ after irrelevant code change
On 5/8/2011 8:25 AM, Michael D. Berger wrote: -Original Message- From: Robert Dewar [mailto:de...@adacore.com] Sent: Sunday, May 08, 2011 11:13 To: Michael D. Berger Cc: gcc@gcc.gnu.org Subject: Re: numerical results differ after irrelevant code change [...] This kind of result is quite expected on an x86 using the old style (default) floating-point (because of extra precision in intermediate results). How does the extra precision lead to the variable result? Also, is there a way to prevent it? It is a pain in regression testing. If you don't need to support CPUs over 10 years old, consider -march=pentium4 -mfpmath=sse or use the 64-bit OS and gcc. Note the resemblance of your quoted differences to DBL_EPSILON from <float.h>. That's 1 ULP relative to 1.0. I have a hard time imagining the nature of real applications which don't need to tolerate differences of 1 ULP. -- Tim Prince
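A small helper (my own illustration, not from the thread) makes the "1 ULP" observation checkable in a regression harness: express the difference between two nearly-equal results in units of DBL_EPSILON at the magnitude of the expected value.

```c
#include <float.h>

/* Difference between two nearly-equal doubles, measured in units of
   DBL_EPSILON scaled by |a| - roughly ulps at a's magnitude.  The
   x87 extra-precision effects discussed above typically show up as
   values near 1 on this scale. */
double ulp_diff(double a, double b)
{
    double d = a - b;
    if (d < 0) d = -d;
    if (a < 0) a = -a;
    return d / (a * DBL_EPSILON);
}
```

A test that tolerates a few ulps on this scale, instead of demanding bitwise equality, survives the x87-vs-SSE differences described above.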
ARM abort() core files
About 3 years ago (August, 2008) there was a discussion here about (not) getting a backtrace from abort(3) on ARM: http://gcc.gnu.org/ml/gcc/2008-08/msg00060.html That thread discussed why core files generated from a call to abort() do not have a stack to review and some possible approaches for modifying the compiler. I can find no resolution to that discussion nor any other discussion of the topic. Our toolchain vendor is currently only providing GCC 4.4 and I have verified the issue still exists. They informed me today that GCC 4.6 "may be better about it". Can someone confirm that a change has been made and where I can find more information about it? Thanks! -- .Tim Tim D. Hammer Software Developer Global Business & Services Group Xerox Corporation M/S 0111-01A 800 Phillips Road Webster, NY 14580 Phone: 585/427-1684 Fax: 585/231-5596 Mail: tim.ham...@xerox.com
Re: Profiling gcc itself
On 11/20/2011 11:10 AM, Basile Starynkevitch wrote: On Sun, 20 Nov 2011 03:43:20 -0800 Jeff Evarts wrote: I posted this question at irc://irc.oftc.net/#gcc and they suggested that I pose it here instead. I do some "large-ish" builds (linux, gcc itself, etc) on a too-regular basis, and I was wondering what could be done to speed things up. A little printf-style checking hints to me that I might be spending the majority of my time in CPP rather than g++, gasm, ld, etc. Has anyone (ever, regularly, or recently) built gcc (g++, gcpp) with profiling turned on? Is it hard? Did you get good results? I'm not sure the question belongs to gcc@gcc.gnu.org, perhaps gcc-h...@gcc.gnu.org might be a better place. If you choose to follow such advice, explaining whether other facilities already in gcc, e.g. http://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html apply to your situation may be useful. -- Tim Prince
Re: C Compiler benchmark: gcc 4.6.3 vs. Intel v11 and others
On 1/19/2012 9:27 AM, willus.com wrote: On 1/19/2012 2:59 AM, Richard Guenther wrote: On Thu, Jan 19, 2012 at 7:37 AM, Marc Glisse wrote: On Wed, 18 Jan 2012, willus.com wrote: For those who might be interested, I've recently benchmarked gcc 4.6.3 (and 3.4.2) vs. Intel v11 and Microsoft (in Windows 7) here: http://willus.com/ccomp_benchmark2.shtml http://en.wikipedia.org/wiki/Microsoft_Windows_SDK#64-bit_development For the math functions, this is normally more a libc feature, so you might get very different results on different OS. Then again, by using -ffast-math, you allow the math functions to return any random value, so I can think of ways to make it even faster ;-) Also for math functions you can simply substitute the Intel compilers one (GCC uses the Microsoft ones) by linking against libimf. You can also make use of their vectorized variants from GCC by specifying -mveclibabi=svml and link against libimf (the GCC autovectorizer will then use the routines from the Intel compiler math library). That makes a huge difference for code using functions from math.h. Richard. -- Marc Glisse Thank you both for the tips. Are you certain that with the flags I used Intel doesn't completely in-line the math2.h functions at the compile stage? gcc? I take it to use libimf.a (legally) I would have to purchase the Intel compiler? In-line math functions, beyond what gcc does automatically (sqrt...) are possible only with x87 code; those aren't vectorizable nor remarkably fast, although quality can be made good (with care). As Richard said, the icc svml library is the one supporting the fast vector math functions. There is also an arch-consistency version of svml (different internal function names) which is not as fast but may give more accurate results or avoid platform-dependent bugs. 
Yes, the Intel library license makes restrictions on usage: http://software.intel.com/en-us/articles/faq-intel-parallel-composer-redistributable-package/?wapkw=%28redistributable+license%29 You might use it for personal purposes under terms of this linux license: http://software.intel.com/en-us/articles/Non-Commercial-license/?wapkw=%28non-commercial+license%29 It isn't supported in the gcc context. Needless to say, I don't speak for my employer. -- Tim Prince
Re: C Compiler benchmark: gcc 4.6.3 vs. Intel v11 and others
On 1/19/2012 9:24 PM, willus.com wrote: On 1/18/2012 10:37 PM, Marc Glisse wrote: On Wed, 18 Jan 2012, willus.com wrote: For those who might be interested, I've recently benchmarked gcc 4.6.3 (and 3.4.2) vs. Intel v11 and Microsoft (in Windows 7) here: http://willus.com/ccomp_benchmark2.shtml http://en.wikipedia.org/wiki/Microsoft_Windows_SDK#64-bit_development For the math functions, this is normally more a libc feature, so you might get very different results on different OS. Then again, by using -ffast-math, you allow the math functions to return any random value, so I can think of ways to make it even faster ;-) I use -ffast-math all the time and have always gotten virtually identical results to when I turn it off. The speed difference is important for me. The default for the Intel compiler is more aggressive than gcc -ffast-math -fno-cx-limited-range, as long as you don't use one of the old buggy mathinline.h header files. For a fair comparison, you need detailed attention to comparable options. If you don't set gcc -ffast-math, you will want icc -fp-model-source. It's good to have in mind what you want from the more aggressive options, e.g. auto-vectorization of sum reduction. If you do want gcc -fcx-limited range, icc spells it -complex-limited-range. -- Tim Prince
Re: weird optimization in sin+cos, x86 backend
On 02/05/2012 11:08 AM, James Courtier-Dutton wrote:

Hi, I looked at this a bit closer. sin(1.0e22) is outside the +-2^63 range, so FPREM1 is used to bring it inside the range. So, I looked at FPREM1 a bit closer.

#include <stdio.h>
#include <math.h>

int main (void)
{
  long double x, r, m;
  x = 1.0e22;
  // x = 5.26300791462049950360708478127784; <- This is what the answer should be, give or take 2*PI.
  m = M_PIl * 2.0;
  r = remainderl(x, m);  // Utilizes FPREM1
  printf ("x = %.17Lf\n", x);
  printf ("m = %.17Lf\n", m);
  printf ("r = %.17Lf\n", r);
  return 1;
}

This outputs:
x = 10000000000000000000000.00000000000000000
m = 6.28318530717958648
r = 2.66065232182161996

But r should be 5.26300791462049950360708478127784... or -1.020177392559086973318201985281... according to Wolfram Alpha and most arbitrary-precision maths libs I tried. I need to do a bit more digging, but this might point to a bug in the cpu instruction FPREM1.

Kind Regards

James

As I recall, the remaindering instruction was documented as using a 66-bit rounded approximation of PI, in case that is what you refer to.

-- Tim Prince
How to figure out the gcc -dP output?
Hello there. I am trying to track down a problem with gcc 4.1 which has to do with inlining and templates on PowerPC. Is there any documentation I can look at related to the output generated with -fdump? I am getting extraneous lwz (load word and zero extend) instructions inserted when calling various methods - after $toc (r2) has been switched to the destination method's global data, just before the method call with the bctrl instruction. This lwz instruction causes a crash on IBM AIX when 32-bit shared libraries are loaded non-contiguously in memory. It looks like various code blocks are not being combined correctly when code is inlined - the extra lwz is being left behind. I have figured out that turning off gcse optimizations will stop this behavior, but doing this causes a performance hit. I would prefer not to upgrade the compiler at this time. With the compiler dump using -fdump, I am looking for a better way to work around this problem. Tim Crook.
RE: How to figure out the gcc -dP output?
Thanks David. I thought -mminimal-toc might have been a better workaround as well :-) . Is there a Bugzilla number for this issue? -Original Message- From: David Edelsohn [mailto:dje@gmail.com] Sent: Tuesday, July 28, 2009 9:46 AM To: Tim Crook Subject: Re: How to figure out the gcc -dP output? Tim, I do not fully understand the complete explanation of the original problem. You mention extraneous lwz and TOC. I think you are referring to a bug in GCC 4.1 that incorrectly emitted loads after the TOC already had been changed for an indirect call. GCSE probably is producing code that requires a constant and GCC needs to place that constant in the TOC. The late creation of the TOC reference is not scheduled correctly. GCSE is an optimization. -mminimal-toc is an option to avoid TOC overflow. Both of these are work-arounds to the problem. Disabling GCSE probably will slow down the application. -mminimal-toc probably will have less of a performance impact. As I mentioned to Chris when I spoke with him last week, I would recommend upgrading to a newer version of GCC because GCC 4.1 no longer is maintained. Many bug fixes, such as one for the problem you are encountering, are incorporated into newer releases. David > I found a possible compiler workaround, compiling with -mminimal-toc. Would > I get better performance by using this, instead of turning off gcse? On Fri, Jul 24, 2009 at 4:34 PM, Tim Crook wrote: > Hello there. > > I am trying to track down a problem with gcc 4.1 which has to do with > inlining and templates on PowerPC. Is there any documentation I can look > related to the output generated with -fdump? I am getting extraneous lwz > (load word and zero extend) instructions inserted when calling various > methods - after $toc (r2) has been switched to the destination method's > global data, just before the method call with the bctrl instruction. 
This lwz > instruction causes a crash on IBM AIX when 32-bit shared libraries are loaded > non-contiguously in memory. It looks like various code blocks are not being > combined correctly when code is inlined - the extra lwz is being left behind. > > I have figured out that turning off gcse optimizations will stop this > behavior, but doing this causes a performance hit. I would prefer not to > upgrade the compiler at this time. With the compiler dump using -fdump, I am > looking for a better way to work around this problem. > > Tim Crook. >
Re: Failure building current 4.5 snapshot on Cygwin
Eric Niebler wrote: Angelo Graziosi wrote: Eric Niebler wrote: I am running into the same problem (cannot build latest snapshot on cygwin). I have built and installed the latest binutils from head (see attached config.log for details). But still the build fails. Any help? This is strange! Recent snapshots (4.3, 4.4, 4.5) build OK both on Cygwin-1.5 and 1.7. In 1.5 I have built the same binutils as in 1.7. I've attached objdir/intl/config.log. It says you have triggered cross compilation mode, without complete setup. Also, it says you are building in a directory below your source code directory, which I always used to do myself, but stopped on account of the number of times I've seen this criticized. The only new build-blocking problem I've run into in the last month is the unsupported autoconf test, which has a #FIXME comment. I had to comment it out.
Re: [4.4] Strange performance regression?
Joern Rennecke wrote: Quoting Mark Tall : Joern Rennecke wrote: But at any rate, the subject does not agree with the content of the original post. When we talk about a 'regression' in a particular gcc version, we generally mean that this version is in some way worse than a previous version of gcc. Didn't the original poster indicate that gcc 4.3 was faster than 4.4 ? In my book that is a regression. He also said that it was a different machine, Core 2 Q6600 vs some kind of Xeon Core 2 system with a total of eight cores. As different memory subsystems are likely to affect the code, it is not an established regression till he can reproduce a performance drop going from an older to a current compiler on the same or sufficiently similar machines, under comparable load conditions - which generally means that the machine must be idle apart from the benchmark. Ian's judgment in diverting to gcc-help was borne out when it developed that -funroll-loops was wanted. This appeared to confirm his suggestion that it might have had to do with loop alignments. As long as everyone is editorializing, I'll venture to say this case raises the suspicion that gcc might benefit from better default loop alignments, at least for that particular CPU. However, I've played a lot of games on Core i7 with varying unrolling etc. I find the behavior of current gcc entirely satisfactory, aside from the verbosity of the options required.
Re: Whole program optimization and functions-only-called-once.
Toon Moene wrote: Richard Guenther wrote: On Sun, Nov 15, 2009 at 8:07 AM, Toon Moene wrote: Steven Bosscher wrote: At least CPROP, LCM-PRE, and HOIST (i.e. all passes in gcse.c), and variable tracking. Are they covered by a --param ? At least that way I could teach them to go on indefinitely ... I think most of them are. Maybe we should diagnose the cases where we hit these limits. That would be a good idea. One other compiler I work with frequently (the Intel Fortran compiler) does just that. However, either it doesn't have or their marketing department doesn't want you to know about knobs to tweak these decisions :-) Both gfortran and ifort have a much longer list of adjustable limits on in-lining than most customers are willing to study or test.
Re: On the x86_64, does one have to zero a vector register before filling it completely ?
Toon Moene wrote: H.J. Lu wrote: On Sat, Nov 28, 2009 at 3:21 AM, Toon Moene wrote:

L.S., Due to the discussion on register allocation, I went back to a hobby of mine: studying the assembly output of the compiler. For this Fortran subroutine (note: unless otherwise told to the Fortran front end, reals are 32 bit floating point numbers):

      subroutine sum(a, b, c, n)
      integer i, n
      real a(n), b(n), c(n)
      do i = 1, n
         c(i) = a(i) + b(i)
      enddo
      end

with -O3 -S (GCC: (GNU) 4.5.0 20091123), I get this (vectorized) loop:

        xorps   %xmm2, %xmm2
.L6:
        movaps  %xmm2, %xmm0
        movaps  %xmm2, %xmm1
        movlps  (%r9,%rax), %xmm0
        movlps  (%r8,%rax), %xmm1
        movhps  8(%r9,%rax), %xmm0
        movhps  8(%r8,%rax), %xmm1
        incl    %ecx
        addps   %xmm1, %xmm0
        movaps  %xmm0, 0(%rbp,%rax)
        addq    $16, %rax
        cmpl    %ebx, %ecx
        jb      .L6

I'm not a master of x86_64 assembly, but this strongly looks like %xmm{0,1} have to be zero'd (%xmm2 is set to zero by xor'ing it with itself), before they are completely filled with the mov{l,h}ps instructions?

I think it is used to avoid partial SSE register stall.

You mean there's no movaps (%r9,%rax), %xmm0 (and mutatis mutandis for %xmm1) instruction (to copy 4*32 bits to the register)?

If you want those, you must request them with -mtune=barcelona.
Re: On the x86_64, does one have to zero a vector register before filling it completely ?
Richard Guenther wrote: On Sat, Nov 28, 2009 at 4:26 PM, Tim Prince wrote: Toon Moene wrote: H.J. Lu wrote: On Sat, Nov 28, 2009 at 3:21 AM, Toon Moene wrote: L.S., Due to the discussion on register allocation, I went back to a hobby of mine: Studying the assembly output of the compiler. For this Fortran subroutine (note: unless otherwise told to the Fortran front end, reals are 32 bit floating point numbers): subroutine sum(a, b, c, n) integer i, n real a(n), b(n), c(n) do i = 1, n c(i) = a(i) + b(i) enddo end with -O3 -S (GCC: (GNU) 4.5.0 20091123), I get this (vectorized) loop: xorps %xmm2, %xmm2 .L6: movaps %xmm2, %xmm0 movaps %xmm2, %xmm1 movlps (%r9,%rax), %xmm0 movlps (%r8,%rax), %xmm1 movhps 8(%r9,%rax), %xmm0 movhps 8(%r8,%rax), %xmm1 incl%ecx addps %xmm1, %xmm0 movaps %xmm0, 0(%rbp,%rax) addq$16, %rax cmpl%ebx, %ecx jb .L6 I'm not a master of x86_64 assembly, but this strongly looks like %xmm{0,1} have to be zero'd (%xmm2 is set to zero by xor'ing it with itself), before they are completely filled with the mov{l,h}ps instructions ? I think it is used to avoid partial SSE register stall. You mean there's no movaps (%r9,%rax), %xmm0 (and mutatis mutandis for %xmm1) instruction (to copy 4*32 bits to the register) ? If you want those, you must request them with -mtune=barcelona. Which would then get you movups (%r9,%rax), %xmm0 (unaligned move). generic tuning prefers the split moves, AMD Fam10 and above handle unaligned moves just fine. Correct, the movaps would have been used if alignment were recognized. The newer CPUs achieve full performance with movups. Do you consider Core i7/Nehalem as included in "AMD Fam10 and above?"
Re: On the x86_64, does one have to zero a vector register before filling it completely ?
Toon Moene wrote: Toon Moene wrote: Tim Prince wrote: > If you want those, you must request them with -mtune=barcelona.

OK, so it is an alignment issue (with -mtune=barcelona):

.L6:
        movups  0(%rbp,%rax), %xmm0
        movups  (%rbx,%rax), %xmm1
        incl    %ecx
        addps   %xmm1, %xmm0
        movaps  %xmm0, (%r8,%rax)
        addq    $16, %rax
        cmpl    %r10d, %ecx
        jb      .L6

Once this problem is solved (well, determined how it could be solved), we go on to the next, the extraneous induction variable %ecx. There are two ways to deal with it:

1. Eliminate it with respect to the other induction variable that counts in the same direction (upwards, with steps 16) and remember that induction variable's (%rax) limit.

2. Count %ecx down from %r10d to zero (which eliminates %r10d as a loop-carried register).

g77 avoided this by coding counted do loops with a separate loop counter counting down to zero - not so with gfortran (quoting):

/* Translate the simple DO construct.  This is where the loop variable
   has integer type and step +-1.  We can't use this in the general case
   because integer overflow and floating point errors could give
   incorrect results.
   We translate a do loop from:

   DO dovar = from, to, step
      body
   END DO

   to:

   [Evaluate loop bounds and step]
   dovar = from;
   if ((step > 0) ? (dovar <= to) : (dovar >= to))
     {
       for (;;)
         {
           body;
cycle_label:
           cond = (dovar == to);
           dovar += step;
           if (cond) goto end_label;
         }
     }
end_label:

   This helps the optimizers by avoiding the extra induction variable
   used in the general case.  */

So either we teach the Fortran front end this trick, or we teach the loop optimization the trick of flipping the sense of a(n otherwise unused) induction variable.

This would have paid off more frequently in i386 mode, where there is a possibility of integer register pressure in loops small enough for such an optimization to succeed. This seems to be among the types of optimizations envisioned for run-time binary interpretation systems.
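Toon's option 2 - counting a separate trip counter down to zero - can be sketched at the C level like this (my illustration with invented names, not compiler output):

```c
/* A separate trip counter runs down to zero, so the loop-exit test
   compares against the constant 0 (often just the flags set by the
   decrement) instead of tying up a register holding the loop bound. */
void sum_vec(float *c, const float *a, const float *b, long n)
{
    long i = 0;
    for (long count = n; count > 0; count--) {
        c[i] = a[i] + b[i];
        i++;
    }
}
```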
Re: Graphite and Loop fusion.
Toon Moene wrote:

      REAL, ALLOCATABLE :: A(:,:), B(:,:), C(:,:), D(:,:), E(:,:), F(:,:)
! ... READ IN EXTENT OF ARRAYS
      READ*,N
! ... ALLOCATE ARRAYS
      ALLOCATE(A(N,N),B(N,N),C(N,N),D(N,N),E(N,N),F(N,N))
! ... READ IN ARRAYS
      READ*,A,B
      C = A + B
      D = A * C
      E = B * EXP(D)
      F = C * LOG(E)

where the four assignments all have the structure of loops like:

      DO I = 1, N
         DO J = 1, N
            X(J,I) = OP(A(J,I), B(J,I))
         ENDDO
      ENDDO

Obviously, this could benefit from loop fusion, by combining the four assignments in one loop.

Provided that it were still possible to vectorize suitable portions, or N is known to be so large that cache locality outweighs vectorization. This raises the question of progress on vector math functions, as well as the one about relative alignments (or ignoring them in view of recent CPU designs).
GCC 4.3.3 Configure and Build for DDRescue
Hello, I'll begin by stating my knowledge of Unix is almost non-existent. Using the basic skills that I learned many years ago, I'm currently trying to rescue a near dead hard drive with DDRescue. First, I need to install a C++ compiler, which I have downloaded (v4.3.3) and unzipped to my Mac. I've read through the instructions to configure and build, but am unable to decipher what I need. Is there a command that will perform a very basic configure and build of GCC 4.3.3? Thanks much.
Re: Need an assembler consult!
FX wrote:

Hi all, I have picked up what seems to be a simple patch from PR36399, but I don't know enough assembler to tell whether it's fixing it completely or not. The following function:

#include <emmintrin.h>

__m128i r(__m128 d1, __m128 d2, __m128 d3, __m128i r, int t, __m128i s)
{ return r+s; }

is compiled by Apple's GCC into:

        pushl   %ebp
        movl    %esp, %ebp
        subl    $72, %esp
        movaps  %xmm0, -24(%ebp)
        movaps  %xmm1, -40(%ebp)
        movaps  %xmm2, -56(%ebp)
        movdqa  %xmm3, -72(%ebp)
        movdqa  24(%ebp), %xmm0    #
        paddq   -72(%ebp), %xmm0   #
        leave
        ret

Instead of the lines marked with #, FSF's GCC gives:

        movdqa  40(%ebp), %xmm1
        movdqa  8(%ebp), %xmm0
        paddq   %xmm1, %xmm0

By fixing SSE_REGPARM_MAX in config/i386/i386.h (following Apple's compiler value), I get GCC now generates:

        movdqa  %xmm3, -72(%ebp)
        movdqa  24(%ebp), %xmm0
        movdqa  -72(%ebp), %xmm1
        paddq   %xmm1, %xmm0

The first two lines are identical to Apple, but the last two don't. They seem OK to me, but I don't know enough assembler to be really sure. Could someone confirm the two are equivalent?

Apparently the same as far as what is returned in xmm0.
Re: The "right way" to handle alignment of pointer targets in the compiler?
Benjamin Redelings I wrote:

Hi, I have been playing with the GCC vectorizer and examining assembly code that is produced for dot products that are not for a fixed number of elements. (This comes up surprisingly often in scientific codes.) So far, the generated code is not faster than non-vectorized code, and I think that it is because I can't find a way to tell the compiler that the target of a double* is 16-byte aligned. From PR 27827 - http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27827 : "I just quickly glanced at the code, and I see that it never uses "movapd" from memory, which is a key to getting decent performance."

How many people would take advantage of special machinery for some old CPU, if that's your goal? Simplifying your example to

double f3(const double* p_, const double* q_, int n)
{
    double sum = 0;
    for (int i = 0; i < n; i++)
        sum += p_[i] * q_[i];
    return sum;
}

On CPUs introduced in the last 2 years, movupd should be as fast as movapd, and -mtune=barcelona should work well in general, not only in this example. The bigger difference in performance, for longer loops, would come with further batching of sums, favoring loop lengths of multiples of 4 (or 8, with unrolling). That alignment already favors a fairly long loop. As you're using C++, it seems you could have used inner_product() rather than writing out a function. My Core i7 showed matrix multiply 25x25 times 25x100 producing 17 Gflops with gfortran in-line code. g++ produces about 80% of that.
Re: The "right way" to handle alignment of pointer targets in the compiler?
Benjamin Redelings I wrote: Thanks for the information!

Here are several reasons (there are more) why gcc uses 64-bit loads by default:

1) For a single dot product, the rate of 64-bit data loads roughly balances the latency of adds to the same register. Parallel dot products (using 2 accumulators) would take advantage of faster 128-bit loads.

2) Run-time checks to adjust alignment, if possible, don't pay off for loop counts < about 40.

3) Several obsolete CPU architectures implemented 128-bit loads by pairs of 64-bit loads.

4) 64-bit loads were generally more efficient than movupd, prior to Barcelona.

In the case you quote, with parallel dot products, 128-bit loads would be required so as to show much performance gain over x87.
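The "parallel dot products (using 2 accumulators)" in reason 1 can be sketched as follows (my illustration, not gcc output): splitting the sum into two partial accumulators breaks the serial add dependence, so paired loads and two independent add chains can be in flight at once.

```c
#include <stddef.h>

/* Two partial sums break the dependence chain on a single
   accumulator; pairs of elements (candidates for 128-bit loads)
   feed two independent add chains. */
double dot2(const double *p, const double *q, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        s0 += p[i] * q[i];
        s1 += p[i + 1] * q[i + 1];
    }
    if (i < n)                      /* odd trailing element */
        s0 += p[i] * q[i];
    return s0 + s1;
}
```

Note the caveat from elsewhere in this thread: regrouping the sum this way changes rounding slightly, which is why gcc only does it under options like -ffast-math.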
Re: adding -fnoalias ... would a patch be accepted ?
torbenh wrote: Can you please explain why you reject the idea of -fnoalias? MSVC has __declspec(noalias), icc has -fnoalias.

MSVC needs it because it doesn't implement restrict and supports violation of typed aliasing rules as a default. ICL needs it for msvc compatibility, but has better alternatives. gcc can't copy the worst features of msvc.
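For contrast with a global -fnoalias switch, the restrict qualifier mentioned above states the same no-aliasing promise per pointer (a sketch with invented names):

```c
/* restrict promises the compiler that *dst and *src never overlap
   during a call, licensing the same reordering and vectorization a
   global no-alias flag would impose everywhere - but scoped to this
   one function, where the programmer can actually guarantee it. */
void scale2(float *restrict dst, const float *restrict src, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = 2.0f * src[i];
}
```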
Re: speed of double-precision divide
Steve White wrote: I was under the misconception that each of these SSE operations was meant to be accomplished in a single clock cycle (although I knew there are various other issues). Current CPU architectures permit an SSE scalar or parallel multiply and add instruction to be issued on each clock cycle. Completion takes at least 4 cycles for add, significantly more for multiply. The instruction timing tables quote throughput (how many cycles between issue) and latency (number of cycles to complete an individual operation). An even more common misconception than yours is that the extra time taken to complete multiply, compared with the time of add, would disappear with fused multiply-add instructions. SSE divide, as has been explained, is not pipelined. The best way to speed up a loop with divide is with vectorization, barring situations such as the one you brought up where divide may not actually be a necessary part of the algorithm.
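One concrete way to keep the unpipelined divide out of a loop, when the divisor is loop-invariant, is the standard reciprocal transform (my sketch with invented names; gcc applies the same idea under -ffast-math). Note the result can differ in the last bit from dividing each element directly.

```c
/* One unpipelined divide up front, then n pipelined (and
   vectorizable) multiplies, instead of n serialized divides. */
void div_all(float *x, int n, float d)
{
    float rd = 1.0f / d;
    for (int i = 0; i < n; i++)
        x[i] *= rd;
}
```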
Re: Support for export keyword to use with C++ templates ?
On 2/2/10 7:19 PM, Richard Kenner wrote: I see that what I need is an assignment for all future changes. If my employer is not involved with any contributions of mine, the employer disclaimer is not needed, right ? It's safest to have it. The best way to prove that your employer is not involved with any contributions of yours is with such a disclaimer. Some employers have had a formal process for approving assignment of own-time contributions, as well as assignments as part of their business, and lack of either form of assignment indicates the employer has forbidden them. -- Tim Prince
Re: Starting an OpenMP parallel section is extremely slow on a hyper-threaded Nehalem
On 2/11/2010 2:00 AM, Edwin Bennink wrote: Dear gcc list, I noticed that starting an OpenMP parallel section takes a significant amount of time on Nehalem cpu's with hyper-threading enabled. If you think a question might be related to gcc, but don't know which forum to use, gcc-help is more appropriate. As your question is whether there is a way to avoid anomalous behaviors when an old Ubuntu is run on a CPU released after that version of Ubuntu, an Ubuntu forum might be more appropriate. A usual way is to shut off HyperThreading in the BIOS when running on a distro which has trouble with it. I do find your observation interesting. As far as I know, the oldest distro which works well on Core I7 is RHEL5.2 x86_64, which I run, with updated gcc and binutils, and HT disabled, as I never run applications which could benefit from HT. -- Tim Prince
Re: Change x86 default arch for 4.5?
On 2/18/2010 4:54 PM, Joe Buck wrote: But maybe I didn't ask the right question: can any x86 experts comment on recently made x86 CPUs that would not function correctly with code produced by --with-arch=i486? Are there any? All CPUs still in production are at least SSE3 capable, unless someone can come up with one of which I'm not aware. Intel compilers made the switch last year to requiring SSE2 capability for the host, as well as in the default target options, even for 32-bit. All x86_64 or X64 CPUs for which any compiler was produced had SSE2 capability, so it is required for those 64-bit targets. -- Tim Prince
Re: [RFH] A simple way to figure out the number of bits used by a long double
On 2/26/2010 5:44 AM, Ed Smith-Rowland wrote: Huh. I would have *sworn* that sizeof(long double) was 10 not 16 even though we know it was 80 bits. As you indicated before, sizeof gives the amount of memory displaced by the object, including padding. In my experience with gcc, sizeof(long double) is likely to be 12 on 32-bit platforms, and 16 on 64-bit platforms. These choices are made to preserve alignment for 32-bit and 128-bit objects respectively, and to improve performance in the 64-bit case, for hardware which doesn't like to straddle cache lines. It seems the topic would have been more appropriate for gcc-help, if related to gcc, or maybe comp.lang.c, if a question about implementation in accordance with standard C. -- Tim Prince
Re: legitimate parallel make check?
On 3/9/2010 4:28 AM, IainS wrote: It would be nice to allow the apparently independent targets [e.g. gcc-c,fortran,c++ etc.] to be (explicitly) make-checked in parallel. On certain targets, it has been necessary to do this explicitly for a long time, submitting make check-gcc, make check-fortran, make check-g++ separately. Perhaps a script could be made which would detect when the build is complete, then submit the separate make check serial jobs together. -- Tim Prince
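A minimal sketch of such a wrapper, with make stubbed out by a shell function so the sketch is self-contained (drop the stub to drive the real testsuite; target names are the ones mentioned in the message):

```shell
make() { echo "make $*"; }   # stub for illustration only; remove in real use

run_checks() {
    # launch each independent check target as a background job...
    for tgt in check-gcc check-fortran check-g++; do
        make -k "$tgt" &
    done
    wait    # ...then block until all of them have finished
}

run_checks
```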
Re: GCC vs ICC
On 3/22/2010 7:46 PM, Rayne wrote: Hi all, I'm interested in knowing how GCC differs from Intel's ICC in terms of the optimization levels and catering to specific processor architecture. I'm using GCC 4.1.2 20070626 and ICC v11.1 for Linux. How does ICC's optimization levels (O1 to O3) differ from GCC, if they differ at all? The ICC is able to cater specifically to different architectures (IA-32, intel64 and IA-64). I've read that GCC has the -march compiler option which I think is similar, but I can't find a list of the options to use. I'm using Intel Xeon X5570, which is 64-bit. Are there any other GCC compiler options I could use that would cater my applications for 64-bit Intel CPUs? Some of that seems more topical on the Intel software forum for icc, and the following more topical on either that forum or gcc-help, where you should go for follow-up. If you are using gcc on Xeon 5570, gcc -mtune=barcelona -ffast-math -O3 -msse4.2 might be a comparable level of optimization to icc -xSSE4.2 For gcc 4.1, you would have to set also -ftree-vectorize, but you would be better off with a current version. But, if you are optimizing for early Intel 64-bit Xeon, -mtune=barcelona would not be consistently good, and you could not use -msse4 or -xSSE4.2. For optimization which observes standards and also disables vectorized sum reduction, you would omit -ffast-math for gcc, and set icc -fp-model source. -- Tim Prince
Re: Compiler option for SSE4
On 3/23/2010 11:02 PM, Rayne wrote: I'm using GCC 4.1.2 20070626 on a server with Intel Xeon X5570. How do I turn on the compiler option for SSE4? I've tried -msse4, -msse4.1 and -msse4.2, but they all returned the error message cc1: error: unrecognized command line option "-msse4.1" (for whichever option I tried). You would need a gcc version which supports sse4. As you said yourself, your version is approaching 3 years old. Actually, the more important option for Xeon 55xx, if you are vectorizing, is the -mtune=barcelona, which has been supported for about 2 years. Whether vectorizing or not, on an 8 core CPU, the OpenMP introduced in gcc 4.2 would be useful. This looks like a gcc-help mail list question, which is where you should submit any follow-up. -- Tim Prince
Re: Optimizing floating point *(2^c) and /(2^c)
On 3/29/2010 10:51 AM, Geert Bosch wrote: On Mar 29, 2010, at 13:19, Jeroen Van Der Bossche wrote: I've recently written a program where taking the average of 2 floating point numbers was a real bottleneck. I've looked into the assembly generated by gcc -O3 and apparently gcc treats multiplication and division by a hard-coded 2 like any other multiplication with a constant. I think, however, that *(2^c) and /(2^c) for floating points, where c is known at compile-time, should be able to be optimized with the following pseudo-code: e = exponent bits of the number; if (e > c && e < (0b111...11) - c) { e += c or e -= c } else { do regular multiplication } Even further optimizations may be possible, such as bit-shifting the significand when e=0. However, that would require checking for a lot of special cases and so many conditional jumps that it's most likely not going to be any faster. I'm not skilled enough with assembly to write this myself and test whether it actually performs faster than the current implementation. Its performance will most likely also depend on the processor architecture, and I could only test this code on one machine. Therefore I ask those who are familiar with gcc's optimization routines to give this 2 seconds of thought, as this is probably rather easy to implement and many programs could benefit from it. For any optimization suggestions, you should start by showing some real, compilable code with a performance problem that you think the compiler could address. Please include details about compilation options, GCC versions and target hardware, as well as observed performance numbers. How do you see that averaging two floating point numbers is a bottleneck? This should only be a single addition and multiplication, and will execute in a nanosecond or so on a moderately modern system. Your particular suggestion is flawed. Floating-point multiplication is very fast on most targets. 
It is hard to see how, on any target with floating-point hardware, manual mucking with the representation can be a win. In particular, your sketch doesn't address underflow and overflow at all. A complete implementation would likely be many times slower than a floating-point multiply. -Geert gcc used to have the ability to replace division by a power of 2 with an fscale instruction, for appropriate targets (and maybe still does). Such targets have nearly disappeared from everyday usage. What remains is the possibility of replacing division by a constant power of 2 with multiplication, but it's generally considered that the programmer should have done that in the first place. icc has such a facility, but it's subject to -fp-model=fast (equivalent to gcc -ffast-math -fno-cx-limited-range), even though it's a totally safe conversion. As Geert indicated, it's almost inconceivable that a correct implementation which takes care of exceptions could match the floating-point hardware performance, even for a case which starts with operands in memory (though you mention the case following an addition). -- Tim Prince
Re: GCC primary/secondary platforms?
On 4/7/2010 9:17 AM, Gary Funck wrote: On 04/07/10 11:11:05, Diego Novillo wrote: Additionally, make sure that the branch bootstraps and tests on all primary/secondary platforms with all languages enabled. Diego, thanks for your prompt reply and suggestions. Regarding the primary/secondary platforms. Are those listed here? http://gcc.gnu.org/gcc-4.5/criteria.html Will there be a notification if and when C++ run-time will be ready to test on secondary platforms, or will platforms like cygwin be struck from the secondary list? I'm 26 hours into testsuite for 4.5 RC for cygwin gcc/gfortran, didn't know of any other supported languages worth testing. My ia64 box died a few months ago, but suse-linux surely was at least as popular as unknown-linux in recent years. -- Tim Prince
Re: GCC primary/secondary platforms?
On 4/8/2010 2:40 PM, Dave Korn wrote: On 07/04/2010 19:47, Tim Prince wrote: Will there be a notification if and when C++ run-time will be ready to test on secondary platforms, or will platforms like cygwin be struck from the secondary list? What exactly are you talking about? Libstdc++-v3 builds just fine on Cygwin. Our release criteria for the secondary platforms is: * The compiler bootstraps successfully, and the C++ runtime library builds. * The DejaGNU testsuite has been run, and a substantial majority of the tests pass. We pass both those criteria with flying colours. What are you worrying about? cheers, DaveK No one answered questions about why libstdc++ configure started complaining about mis-match in style of wchar support a month ago. Nor did I see anyone give any changes in configure procedure. Giving it another try at a new download today. -- Tim Prince
Re: GCC primary/secondary platforms?
On 4/8/2010 6:24 PM, Dave Korn wrote: Nor did I see anyone give any changes in configure procedure. Giving it another try at a new download today. Well, nothing has changed, but then again I haven't seen anyone else complaining about this, so there's probably some problem in your build environment; let's see what happens with your fresh build. (I've built the 4.5.0-RC1 candidate without any complications and am running the tests right now.) Built OK this time around, no changes here either, except for cygwin1 update. testsuite results in a couple of days. Thanks. -- Tim Prince
Re: Why not contribute? (to GCC)
On 4/23/2010 1:05 PM, HyperQuantum wrote: On Fri, Apr 23, 2010 at 9:58 PM, HyperQuantum wrote: On Fri, Apr 23, 2010 at 8:39 PM, Manuel López-Ibáñez wrote: What reasons keep you from contributing to GCC? The lack of time, for the most part. I submitted a feature request once. It's now four years old, still open, and the last message it received was two years ago. (PR26061) The average time for acceptance of a PR with a patch submission from an outsider such as ourselves is over 2 years, and by then the patch no longer fits, has to be reworked, and is about to become moot. I still have the FSF paperwork in force, as far as I know, from over a decade ago, prior to my current employment. Does it become valid again upon termination of employment? My current employer has no problem with the FSF paperwork for employees whose primary job is maintenance of gnu software (with committee approval), but this does not extend to those of us for whom it is a secondary role. There once was a survey requesting responses on how our FSF submissions compared before and after current employment began, but no summary of the results. -- Tim Prince
Re: Autovectorizing does not work with classes
Georg Martius wrote: > Dear gcc developers, > > I am new to this list. > I tried to use the auto-vectorization (4.2.1 (SUSE Linux)) but unfortunately > with limited success. > My code is basically a matrix library in C++. The vectorizer does not like > the member variables. Consider this code compiled with > gcc -ftree-vectorize -msse2 -ftree-vectorizer-verbose=5 > -funsafe-math-optimizations > that gives basically "not vectorized: unhandled data-ref" > > class P{ > public: > P() : m(5),n(3) { > double *d = data; > for (int i=0; i<m*n; i++) d[i] = i/10.2; > } > void test(const double& sum); > private: > int m; > int n; > double data[15]; > }; > > void P::test(const double& sum) { > double *d = this->data; > for(int i=0; i<m*n; i++) { > d[i]+=sum; > } > } > > whereas the more or less equivalent C version works just fine: > > int m=5; > int n=3; > double data[15]; > > void test(const double& sum) { > int mn = m*n; > for(int i=0; i<mn; i++) { > data[i]+=sum; > } > } > > > Is there a fundamental problem in using the vectorizer in C++? > I don't see any C code above. As another reply indicated, the most likely C idiom would be to pass sum by value. Alternatively, you could use a local copy of sum, in cases where that is a problem. The only fundamental vectorization problem I can think of which is specific to C++ is the lack of a standard restrict keyword. In g++, __restrict__ is available. A local copy (or value parameter) of sum avoids the need for the compiler to recognize const or restrict as an assurance of no value modification. The loop has to have known fixed bounds at entry in order to vectorize. If your C++ style doesn't support that, e.g. by calculating the end value outside the loop as you do in the latter version, then you do have a problem with vectorization.
Re: question. type long long
Александр Струняшев wrote: > Good afternoon. > I need some help. As from what versions your compiler understand that > "long long" is 64 bits ? > > Best regards, Alexander > > P.S. Sorry for my mistakes, I know English bad. No need to be sorry about English, but the topic is OK for gcc-help, not gcc development. gcc was among the first compilers to support long long (always as 64-bit), the only problem being that it was a gnu extension for g++. In that form, the usage may not have settled down until g++ 4.1. The warnings for attempting long long constants in 32-bit mode, without the LL suffix, have been a subject of discussion: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13358 The warning doesn't mean that long long could be less than 64 bits; it means the constant without the LL suffix is less than 64 bits.
Re: need to find functions definitions
On Tuesday 21 October 2008 20:07:10 `VL wrote: > Hello, ALL. > > I recently started to actively program using C and found that tools like > ctags or cscope do not work properly for big projects. Quite often they > can't find a function or symbol definition. The problem here is that they > don't use full code parsing, but just some sort of regular expressions. > > I need a tool for automatic code exploration that can at least find the > definition of every symbol without problems. If you know any - please > tell. > > Now, to gcc. It seems to me that an existing and working compiler is an ideal > place to embed such a tool - since it already knows all the things required. > > I have one idea: i'm almost sure that inside gcc somewhere there is a place > (a function i believe) that is called each time during compilation when the > definition of something (type, variable, function...) is found, and in this > place gcc has context information - the source file and line where this > definition came from. So if i add something like printf("DEFINITION: > %s,%s,%d\n", info->object_type,info->src_file,info->line) into that place, > i will get information about every thing that the compiler found. > > What i like more about this way of getting information about symbol > definitions is that i get references only to the parts of the source that were > actually compiled. I.e. if there are a lot of #ifdef's, it is hard to > know what part of the code will be used. > > So, my questions are: > > 1) Is it possible? Is there a single place where all the information is easily > accessible? 2) If yes - where is it, and where can i find details about the > internals of gcc? 3) Any good alternatives to cscope/ctags? It seemed to > me that > eclipse has some good framework, but it looks to be too tightly integrated > with it... > > Thank you! Hi, wouldn't it be easier to just compile with debug symbols (-g) and then look into the symbol table or into the DWARF debug information? 
Both can be done with the tool objdump contained in the binutils (normally installed on each linux), and there are libraries for both tasks to read and use the information in own applications. You'll get symbol (functions/methods, arguments, variables) names, addresses, types, etc. Tim
Re: Backward Compatibility of RHEL Advanced Server and GCC
Steven Bosscher wrote: On Wed, Oct 29, 2008 at 6:19 AM, S. Suhasini <[EMAIL PROTECTED]> wrote: We would like to know whether the new version of the software (compiled with the new GCC) can be deployed and run on the older setup with RHEL AS 3 and GCC 2.96. We need not compile again on the older setup. Will there be any run-time libraries dependency? Would be very grateful if we get a response for this query. It seems to me that this kind of question is best asked on a RedHat support list, not on a list where compiler development is discussed. FWIW, there is no "official" GCC 2.96, see http://gcc.gnu.org/gcc-2.96.html. This might be partially topical on the gcc-help list. If dynamic libraries are in use, there will be trouble.
Re: change to gcc from lcc
On Friday 14 November 2008 10:09:22 Anna Sidera wrote: > Hello, > > The following code works in lcc in windows but it does not work in gcc in > unix. I think it is a memory problem. In lcc there is an option to use more > temporary memory than the default. Is there something similar in gcc? > > #include <stdio.h> > #include > #include > #include > int main() > { > int i, j; > int buffer1[250][100]; > for (i=0; i<250; i++) { > for (j=0; j<100; j++) { > buffer1[i][j]=0; > } > } > printf("\nThe program finished successfully\n"); > return 0; > } > > Many Thanks, > Anna Anna, the code you provided tries to allocate a huge chunk of memory on the stack. This is not the way things should be done. Even if the compiler allows for "using more temporary memory than the default", the solution is by no means portable. A more elegant solution is to use memory on the heap: #include <stdio.h> #include <stdlib.h> int main() { int i, j; int *buf = (int*) malloc (250 * 100 * sizeof(int)); for (i=0; i<250; i++) { for (j=0; j<100; j++) { buf[i*100 + j]=0; } } free (buf); printf("\nYay! :D\n"); return 0; } Tim
Re: Cygwin support
Brian Dessent wrote: > Cygwin has been a secondary target for a number of years. MinGW has > been a secondary target since 4.3. This generally means that they > should be in fairly good shape, more or less. To quote the docs: > >> Our release criteria for the secondary platforms is: >> >> * The compiler bootstraps successfully, and the C++ runtime library >> builds. >> * The DejaGNU testsuite has been run, and a substantial majority of the >> tests pass. > > > More recently I've seen Danny Smith report that the IRA merge broke > MinGW (and presumably Cygwin, since they share most of the same code) > bootstrap. I haven't tested this myself recently so I don't know if > it's still broken or not. > I've run the bootstrap and testsuite twice in the last month. The bootstrap failures are due to a broken #ifdef specific to cygwin in the headers provided with cygwin, the requirement for a specific version of autoconf (not available in setup), and the need to remove the -werror in libstdc++ build (because of minor discrepancies in cygwin headers). All of those are easy to rectify, but fixes seem unlikely to be considered by the decision makers. However, the C++ testsuite results are unacceptable, with many internal errors. For some time now, gfortran has been broken for practical purposes, even when it passes testsuite, as it seems to have a memory leak. This shows up in the public wiki binaries. So, there are clear points for investigation of cygwin problems, and submission of PRs, should you be interested. > Running the dejagnu testsuite on Cygwin is > excruciatingly slow due to the penalty incurred from emulating fork. It runs over a weekend on a Pentium D which I brought back to life by replacing the CPU cooler system. 
I have no problem with running this if I am in the office when the snapshot is released, but I think there is little interest in fixing the problems which are specific to g++ on cygwin, yet working gcc and gfortran aren't sufficient for gcc upgrades to be accepted. Support for 64-bit native looks like it will be limited to mingw, so I no longer see a future for gcc on cygwin.
GCC 3.4.6 on x86_64: __builtin_frame_address(1) of topmost frame doesn't return 0x0
Hi, in binaries compiled with gcc 3.4.6 on an x86_64 machine, I get the following behaviour. I wrote a little testcase: int main(int argc, char **argv) { unsigned long addr; if ( (addr = (unsigned long)(__builtin_frame_address(0))) ) { printf ("0x%08lx\n", addr); if ( (addr = (unsigned long)(__builtin_frame_address(1))) ) { printf ("0x%08lx\n", addr); if ( (addr = (unsigned long)(__builtin_frame_address(2))) ) { printf ("0x%08lx\n", addr); // ... some more scopes ... } } } return 0; } This code is a bit ugly; I made it that way because of this part of gcc's manpages: "GCC also has two builtins that can assist you, but which may or may not be implemented fully on your architecture, and those are __builtin_frame_address and __builtin_return_address. Both of which want an immediate integer level (by immediate, I mean it can't be a variable)." - but it doesn't change the outcome of the test, anyway. I ran the test on three machines with the following results: 1) [EMAIL PROTECTED] ~]$ uname -m i686 [EMAIL PROTECTED] ~]$ gcc -v Reading specs from /usr/lib/gcc/i386-redhat-linux/3.4.6/specs Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=i386-redhat-linux Thread model: posix gcc version 3.4.6 20060404 (Red Hat 3.4.6-10) [EMAIL PROTECTED] ~]$ gcc -o test test.c && ./test 0xbfefc048 0xbfefc0a8 [EMAIL PROTECTED] ~]$ 2) [EMAIL PROTECTED] ~]$ uname -m x86_64 [EMAIL PROTECTED] ~]$ gcc -v Using built-in specs. 
Target: x86_64-redhat-linux Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --enable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux Thread model: posix gcc version 4.1.2 20070626 (Red Hat 4.1.2-14) [EMAIL PROTECTED] ~]$ gcc -o test test.c && ./test 0x7fffc400c8c0 [EMAIL PROTECTED] ~]$ 3) [EMAIL PROTECTED] ~]$ uname -m x86_64 [EMAIL PROTECTED] ~]$ gcc -v Reading specs from /usr/lib/gcc/x86_64-redhat-linux/3.4.6/specs Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --disable-checking --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-java-awt=gtk --host=x86_64-redhat-linux Thread model: posix gcc version 3.4.6 20060404 (Red Hat 3.4.6-10) [EMAIL PROTECTED] ~]$ gcc -o test test.c && ./test 0x7fb8c0 0x35952357a8 0x7fb9a8 0x7fbb7b 0x454d414e54534f48 Segmentation fault [EMAIL PROTECTED] ~]$ So, on the 32bit machine and on the 64bit machine running gcc 4.x, the end of the stack is found (__builtin_frame_address(n) returned 0x0). The output is a bit different, apparently on the 32bit machine, the stackframe of the caller of main() is also found, but that is not important for the error. On the 64bit machine using gcc 3.4.6 however, at some point, garbage (?) is returned. My questions now are, is this known behaviour / a known issue? I didn't find a fitting patch. Or, can the fix for it be backported from gcc 4.x to 3.4.x? I cannot switch to gcc 4.x for some other reasons. If all this doesn't result in a solution, is there maybe another way for me to determine which stackframe is the topmost one? 
(Should I just compare the function name with "main"? That'd be a bit dirty, wouldn't it?) Thanks, Tim München
Re: Purpose of GCC Stack Padding?
Andrew Tomazos wrote: I've been studying the x86 compiled form of the following function: void function() { char buffer[X]; } where X = 0, 1, 2 .. 100 Naively, I would expect to see: pushl %ebp movl %esp, %ebp subl $X, %esp leave ret Instead, the stack appears to be padded: For a buffer size of 0 the stack size is 0 For a buffer size of 1 to 7 the stack size is 16 For a buffer size of 8 to 12 the stack size is 24 For a buffer size of 13 to 28 the stack size is 40 For a buffer size of 29 to 44 the stack size is 56 For a buffer size of 45 to 60 the stack size is 72 For a buffer size of 61 to 76 the stack size is 88 For a buffer size of 77 to 92 the stack size is 104 For a buffer size of 93 to 100 the stack size is 120 When X >= 8 gcc adds a stack corruption check (__stack_chk_fail), which accounts for an extra 4 bytes of stack space in these cases. This does not explain the rest of the padding. Can anyone explain the purpose of the rest of the padding? This looks like more of a gcc-help question; trying to move the thread there. Unless you over-ride the defaults with -mpreferred-stack-boundary (or -Os, which probably implies a change in stack boundary), or ask for a change on the basis of making a leaf function, you are generating alignment compatible with the use of SSE parallel instructions. The stack, then, must be 16-byte aligned before entry and at exit, and a buffer of 16 bytes or more must also be 16-byte aligned. I believe there is a move afoot to standardize the treatment for the most common x86 32-bit targets; that was done from the beginning for 64-bit. I don't know if you are using x86 to imply 32-bit, in accordance with Windows terminology.
Re: Upgrade to GCC.4.3.2
Philipp Thomas wrote: > On Sun, 28 Dec 2008 14:24:22 -0500, you wrote: > >> I have SLES9 and Linux-2.6.5-7.97 kernel install on i586 intel 32 bit >> machine. The compiler is gcc-c++3.3.3-43.24. I want to upgrade to >> GCC4.3.2. My question are: Would this upgrade work with >> SLES9? > > This is the wrong list for such questions. You should try a SUSE > specific list like opens...@opensuse.org or > opensuse-programm...@opensuse.org gcc-help is a reasonable choice as well.
Re: gcc binary download
Tobias Burnus wrote: > > Otherwise, you could consider building GCC yourself, cf. > http://gcc.gnu.org/install/. (Furthermore, some gfortran developers > offer regular GCC builds, which are linked at > http://gcc.gnu.org/wiki/GFortranBinaries; those are all unofficial > builds, come without any warranty/support, and due to, e.g., library > issues they may not work on your system.) > I believe the wiki builds include C and Fortran, but not C++, in view of the additional limitations in supporting a new g++ on a reasonable range of targets. Even so, there may be minimum requirements on glibc and binutils versions.
Re: Binary Autovectorization
Rodrigo Dominguez wrote: > I am looking at binary auto-vectorization or taking a binary and rewriting > it to use SIMD instructions (either statically or dynamically). That's a tall order, considering how much source level dependency information is needed. I don't know whether proprietary binary translation projects currently under way promise to add vectorization, or just to translate SIMD vector code to new ISA.
Re: -mfpmath=sse,387 is experimental ?
Zuxy Meng wrote: > Hi, > > "Timothy Madden" wrote in message: >> I am sure having twice the number of registers (sse+387) would make a >> big difference. You're not counting the rename registers, you're talking about 32-bit mode only, and you're discounting the different mode of accessing the registers. >> >> How would I know if my AMD Sempron 2200+ has separate execution units >> for SSE and >> FPU instructions, with independent registers ? > > Most CPUs use the same FP unit for both x87 and SIMD operations, so it > wouldn't give you double the performance. The only exception I know of > is the K6-2/3, whose x87 and 3DNow! units are separate. > -march=pentium-m observed the preference of those CPUs for mixing the types of code. This was due more to the limited issue rate for SSE instructions than to the expanded number of registers in use. You are welcome to test it on your CPU; however, AMD CPUs were designed to perform well with SSE alone, particularly in 64-bit mode.
Re: GCC 4.4.0 Status Report (2009-03-13)
Chris Lattner wrote: > > On Mar 23, 2009, at 8:02 PM, Jeff Law wrote: > >> Chris Lattner wrote: >>>>> >>>> These companies really don't care about FOSS in the same way GCC >>>> developers do. I'd be highly confident that this would still be a >>>> serious issue for the majority of the companies I've interacted with >>>> through the years. >>> >>> Hi Jeff, >>> >>> Can you please explain the differences you see between how GCC >>> developers and other people think about FOSS? I'm curious about your >>> perception here, and what basis it is grounded on. >>> >> I'd divide customers into two broad camps. Both camps are extremely >> pragmatic, but they're focused on two totally different goals. > > Thanks Jeff, I completely agree with you. Those camps are very common > in my experience as well. Do you consider GCC developers to fall into > one of these two categories, or do you see them as having a third > perspective? I know that many people have their own motivations and > personal agenda (and it is hard to generalize) but I'm curious what you > meant above. > > Thanks! > > -Chris > >> >> >> The first camp sees FOSS toolkits as a means to help them sell more >> widgets, typically processors & embedded development kits. Their >> belief is that a FOSS toolkit helps build a developer eco-system >> around their widget, which in turn spurs development of consumable >> devices which drive processor & embedded kit sales. The key for >> these guys is free, as in beer, widely available tools. The fact that >> the compiler & assorted utilities are open-source is largely irrelevant. >> >> The second broad camp I run into regularly are software developers >> themselves building applications, most often for internal use, but >> occasionally they're building software that is then licensed to their >> customers. 
They'd probably describe the compiler & associated >> utilities as a set of hammers, screwdrivers and the like -- they're >> just as happy using GCC as any other compiler so long as it works. >> The fact that the GNU tools are open source is completely irrelevant >> to these guys. They want to see standards compliance, abi >> interoperability, and interoperability with other tools (such as >> debuggers, profilers, guis, etc). They're more than willing to swap >> out one set of tools for another if it gives them some advantage. >> Note that an advantage isn't necessarily compile-time or runtime >> performance -- it might be ease of use, which they believe allows >> their junior level engineers to be more effective (this has come up >> consistently over the last few years). >> >> Note that in neither case do they really care about the open-source >> aspects of their toolchain (or for the most part the OS either). >> They may (and often do) like the commoditization of software that FOSS >> tends to drive, but don't mistake that for caring about the open >> source ideals -- it's merely cost-cutting. >> >> Jeff >> >> > Software developers I deal with use gcc because it's a guaranteed included part of the customer platforms they are targeting. They're generally looking for a 20% gain in performance plus support before adopting commercial alternatives. The GUIs they use don't live up to the advertisements about ease of use. This doesn't necessarily put them in either of Jeff's camps. Tim
Re: Minimum GMP/MPFR version bumps for GCC-4.5
Kaveh R. Ghazi wrote: > What versions of GMP/MPFR do you get on > your typical development box and how old are your distros? > OpenSuSE 10.3 (originally released Oct. 07): gmp-devel-4.2.1-58 gmp-devel-32bit-4.2.1-58 mpfr-2.2.1-45
Re: heise.de comment on 4.4.0 release
Tobias Burnus wrote: > Toon Moene wrote: Can somebody with access to SPEC sources confirm / deny and file a bug report, if appropriate? I just started working on SPEC CPU2006 issues this week. > Seemingly yes. To a certain extent this was by accident, as "-msse3" was > used, but on i586 it is only effective with -mfpmath=sse (which is not > completely obvious). By the way, my tests using the Polyhedron benchmark > show that for 32bit, x87 and SSE are similarly fast, depending a lot on > the test case, so it does not slow down the benchmark too much. Certain AMD CPUs had shorter latencies for scalar single precision sse, but generally the advantage of sse comes from vectorization. > > If I understood correctly, the 32bit mode was used since the 64bit mode > needs more than the available 2GB memory. Certain commercial compilers make an effort to switch to 32-bit mode automatically on several CPU2006 benchmarks, as they are too small to run as fast in 64-bit mode. > > Similarly, the option -funroll-loops was avoided as they expect that > unrolling interacts badly with the small cache Atom processors have. > (That CPU2006 runs that long does not make testing different options > easy.) I'm surprised that SPEC 2006 is considered relevant to Atom. The entire thing (base only) has been running under 10 hours on a dual quad core system. I've heard several times the sentiment that there ought to be an "official" harness to run a single test, trying various options. > I would have liked the options to be reported. For instance, > -ffast-math was not used out of fear that it produces too-imprecise > results, causing SPEC to abort. (Admittedly, I'm also careful with that > option, though I assume that -ffast-math works for SPEC.) On the other > hand, certain flags implied by -ffast-math are already applied at -O1 > by some commercial compilers. SPEC probably has been the biggest driver for inclusion of insane options at default in commercial compilers. 
It's certainly not an example of acceptable practice in writing portable code. I have yet to find a compiler which didn't fail at least one SPEC test, and I don't blame the compilers. There are dependencies on unusual C++ extensions, which somehow weren't noticed before, examples of using "f77" as an excuse for hiding one's intentions, and expectations of optimizations which have little relevance for serious applications. > > David Korn wrote: >> They accused us of a too-hasty release. My irony meter exploded! Anyway, a fault in support for a not-so-open benchmark application seems even less relevant in an open source effort than it is to compilers which depend on ranking for sales success.
Re: Bootstrap broken by ppl/cloog config problem: finds non-system/non-standard "/include" dir
Dave Korn wrote: > > Heh, I was just about to post that, only I was looking at $clooginc rather > than $pplinc! The same problem exists for both; I'm pretty sure we should > fall back on $prefix if the --with option is empty. > When I bootstrapped gcc 4.5 on cygwin yesterday, configure recognized the newly installed ppl, but not the cloog. The bootstrap completed successfully, and I'm not looking a gift horse in the mouth.
Re: Bootstrap broken by ppl/cloog config problem: finds non-system/non-standard "/include" dir
Dave Korn wrote: > Tim Prince wrote: >> Dave Korn wrote: >> >>> Heh, I was just about to post that, only I was looking at $clooginc rather >>> than $pplinc! The same problem exists for both; I'm pretty sure we should >>> fall back on $prefix if the --with option is empty. >>> >> When I bootstrapped gcc 4.5 on cygwin yesterday, configure recognized the >> newly installed ppl, but not the cloog. The bootstrap completed >> successfully, and I'm not looking a gift horse in the mouth. > > You don't have a bogus /include dir, but I bet you'll find -I/include in > PPLINC. > > It would be interesting to know why it didn't spot cloog. What's in your > top-level $objdir/config.log? > #include no such file -I/include was set by configure. As you say, there is something bogus here. setup menu shows cloog installed in development category, but I can't find any such include file. Does this mean the cygwin distribution of cloog is broken?
Re: Bootstrap broken by ppl/cloog config problem: finds non-system/non-standard "/include" dir
Dave Korn wrote: Tim Prince wrote: #include no such file -I/include was set by configure. As you say, there is something bogus here. setup menu shows cloog installed in development category, but I can't find any such include file. Does this mean the cygwin distribution of cloog is broken? Did you make sure to get the -devel packages as well as the libs? That's the usual cause of this kind of problem. I highly recommend the new version of setup.exe that has a package-list search box :-) cheers, DaveK OK, I see there is a libcloog-devel in addition to the cloog Dev selection, guess that will fix it for cygwin. I tried to build cloog for IA64 linux as well, gave up on include file parsing errors.
Re: [Fwd: Failure in bootstrapping gfortran-4.5 on Cygwin]
Ian Lance Taylor wrote: Angelo Graziosi writes: The current snapshot 4.5-20090507 fails to bootstrap on Cygwin: It did bootstrap effortlessly for me, once I logged off to clear hung processes, with the usual disabling of strict warnings. I'll let testsuite run over the weekend.
Re: Link error ....redefinition of......
On Tuesday 02 June 2009 08:16:35 Alex Luya wrote: > I downloaded the source code for the book <<Data Structures and Algorithm Analysis in C++ (Second Edition), by Mark Allen Weiss>> > from http://users.cs.fiu.edu/~weiss/dsaa_c++/code/, tried to compile > it, but got many errors, most of them saying: > ... previously declared here > ...: redefinition of ... > > I think templates cause these errors, but how to fix it? This is not the correct mailing list for such questions! Nevertheless, the reason for your compile errors is a simple one. Just drop the line #include "StackAr.cpp" from your header file. Why are you trying to include the implementation in the header? The other way round is how things work! (And you do have the header include in your implementation - why both directions?) > --- > My configuration: > Ubuntu 9.04 > GCC version 4.3.3 (Ubuntu 4.3.3-5ubuntu4) > Eclipse 3.4 > CDT: 5.0.2 > > Files and error message are following:
>
> StackAr.h
> ---------
> #ifndef STACKAR_H
> #define STACKAR_H
>
> #include "../vector.h"
> #include "../dsexceptions.h"
>
> template <class Object>
> class Stack
> {
> public:
>   explicit Stack( int capacity = 10 );
>   bool isEmpty( ) const;
>   ...
> #include "StackAr.cpp"
> #endif
>
> StackAr.cpp
> -----------
> #include "StackAr.h"
>
> template <class Object>
> Stack<Object>::Stack( int capacity ) : theArray( capacity )
> {
>   topOfStack = -1;
> }
>
> template <class Object>
> bool Stack<Object>::isEmpty( ) const
> {
>   return topOfStack == -1;
> }
> ...
> Test.cpp
> --------
> #include <iostream>
> #include "StackAr.h"
> using namespace std;
>
> int main()
> {
>   Stack<int> s;
>
>   for (int i = 0; i < 10; i++)
>     s.push(i);
>
>   while (!s.isEmpty())
>     cout << s.topAndPop() << endl;
>   return 0;
> }
>
> Error message:
>
> Build of configuration Debug for project DACPP
>
> make all
> Building file: ../src/stack/StackAr.cpp
> Invoking: GCC C++ Compiler
> g++ -O0 -g3 -Wall -c -fmessage-length=0 -MMD -MP -MF"src/stack/StackAr.d" -MT"src/stack/StackAr.d" -o"src/stack/StackAr.o" "../src/stack/StackAr.cpp"
> ../src/stack/StackAr.cpp:7: erreur: redefinition of 'Stack<Object>::Stack(int)'
> ../src/stack/StackAr.cpp:7: erreur: 'Stack<Object>::Stack(int)' previously declared here
> ../src/stack/StackAr.cpp:17: erreur: redefinition of 'bool Stack<Object>::isEmpty() const'
> ../src/stack/StackAr.cpp:17: erreur: 'bool Stack<Object>::isEmpty() const' previously declared here
> ...
-- Tim München, M.Sc. muenc...@physik.uni-wuppertal.de Bergische Universitaet FB C - Physik Tel.: +49 (0)202 439-3521 Gaussstr. 20 Fax: +49 (0)202 439-2811 42097 Wuppertal
Re: Failure building current 4.5 snapshot on Cygwin
Angelo Graziosi wrote: > I want to flag the following failure I have seen on Cygwin 1.5 trying to > build current 4.5-20090625 gcc snapshot: > checking whether the C compiler works... configure: error: in > `/tmp/build/intl': > configure: error: cannot run C compiled programs. > If you meant to cross compile, use `--host'. > See `config.log' for more details. I met the same failure on Cygwin 1.7 with yesterday's and last week's snapshots. I didn't notice that it refers to intl/config.log, so will go back and look, as you didn't show what happened there. On a slightly related subject, I have shown that the libgfortran.dll.a and libgomp.dll.a are broken on cygwin builds, including those released for cygwin, as shown by the test case I submitted on the cygwin list earlier this week. --enable-shared has never been satisfactory for gfortran on cygwin.
Re: Failure building current 4.5 snapshot on Cygwin
Dave Korn wrote: Angelo Graziosi wrote: I want to flag the following failure I have seen on Cygwin 1.5 trying to build current 4.5-20090625 gcc snapshot: So what's in config.log? And what binutils are you using? cheers, DaveK In my case, it says no permission to execute a.exe. However, I can run the intl configure and make from the command line. When I do that and attempt to restart stage 2, it stops in libiberty, and again I have to execute steps from the command line.
Re: Failure building current 4.5 snapshot on Cygwin
Kai Tietz wrote: 2009/6/26 Seiji Kachi : Angelo Graziosi wrote: Dave Korn ha scritto: Angelo Graziosi wrote: I want to flag the following failure I have seen on Cygwin 1.5 trying to build current 4.5-20090625 gcc snapshot: So what's in config.log? And what binutils are you using? The config logs are attached, while binutils is the current in Cygwin-1.5, i.e. 20080624-2. Cheers, Angelo. I have also seen a similar failure, and the reason in my environment is as follows. (1) In my case, the gcc build completes successfully, but an a.exe compiled by the new compiler fails. The error message is $ ./a.exe bash: ./a.exe: Permission denied Source code of a.exe is quite simple: main() { printf("Hello\n"); } (2) This failure occurs from gcc trunk r148408. r148407 is OK. (3) r148408 removed "#ifdef DEBUG_PUBTYPES_SECTION". r148407 does not generate a debug_pubtypes section, but r148408 and later versions generate a debug_pubtypes section in the object when we set a debug option. (4) The gcc build sequence usually uses a debug option. (5) My cygwin environment seems not to accept the debug_pubtypes section, and pops up a "Permission denied" error. When I reverted "#ifdef DEBUG_PUBTYPES_SECTION" in dwarf2out.c, the failure disappeared. Does this failure occur only on cygwin? Regards, Seiji Kachi No, this bug appeared on all windows pe-coff targets. A fix for this was already checked in yesterday on binutils. Could you try it with the current binutils head version? Cheers, Kai Is this supposed to be sufficient information for us to find that binutils? I may be able to find an insider colleague, otherwise I would have no chance.
Re: Failure building current 4.5 snapshot on Cygwin
Kai Tietz wrote: 2009/6/26 Tim Prince : Kai Tietz wrote: 2009/6/26 Seiji Kachi : Angelo Graziosi wrote: Dave Korn ha scritto: Angelo Graziosi wrote: I want to flag the following failure I have seen on Cygwin 1.5 trying to build current 4.5-20090625 gcc snapshot: So what's in config.log? And what binutils are you using? The config logs are attached, while binutils is the current in Cygwin-1.5, i.e. 20080624-2. Cheers, Angelo. I have also seen a similar failure, and the reason in my environment is as follows. (1) In my case, the gcc build completes successfully, but an a.exe compiled by the new compiler fails. The error message is $ ./a.exe bash: ./a.exe: Permission denied Source code of a.exe is quite simple: main() { printf("Hello\n"); } (2) This failure occurs from gcc trunk r148408. r148407 is OK. (3) r148408 removed "#ifdef DEBUG_PUBTYPES_SECTION". r148407 does not generate a debug_pubtypes section, but r148408 and later versions generate a debug_pubtypes section in the object when we set a debug option. (4) The gcc build sequence usually uses a debug option. (5) My cygwin environment seems not to accept the debug_pubtypes section, and pops up a "Permission denied" error. When I reverted "#ifdef DEBUG_PUBTYPES_SECTION" in dwarf2out.c, the failure disappeared. Does this failure occur only on cygwin? Regards, Seiji Kachi No, this bug appeared on all windows pe-coff targets. A fix for this was already checked in yesterday on binutils. Could you try it with the current binutils head version? Cheers, Kai Is this supposed to be sufficient information for us to find that binutils? I may be able to find an insider colleague, otherwise I would have no chance. Hello, you can find the binutils project as usual at http://sources.redhat.com/binutils/ . You can find on that page how to get the current cvs version of binutils. This project contains the gnu tools, like dlltool, as, objcopy, ld, etc.
The issue you are running into is caused by a failure in binutils to set correct section flags for debugging sections. It was exposed by the last change in gcc, the output of the .debug_pubtypes section. There is a patch already applied to binutils's repository head which should solve the issue described in this thread. We at mingw-w64 already ran into this issue and have taken care of it. Cheers, Kai My colleague suggested building and installing last week's binutils release. I did so, but it didn't affect the requirement to run each stage 2 configure individually from the command line. Thanks, Tim
Re: random numbers
ecrosbie wrote: how do I generate random numbers in an f77 program? Ed Crosbie
Re: random numbers
ecrosbie wrote: how do I generate random numbers in an f77 program? Ed Crosbie This subject isn't topical on the gcc development forum. If you wish to use a gnu Fortran random number generator, please consider gfortran, which implements the language standard random number facility. http://gcc.gnu.org/onlinedocs/gcc-4.4.0/gfortran/ Questions might be asked on the gfortran list (follow-up set) or comp.lang.fortran. In addition, you will find plenty of other advice by using your web browser.
Re: optimizing a DSO
On 5/28/2010 11:14 AM, Ian Lance Taylor wrote: Quentin Neill writes: A little off topic, but by what facility does the compiler know the linker (or assembler for that matter) is gnu? When you run configure, you can specify --with-gnu-as and/or --with-gnu-ld. If you do, the compiler will assume the GNU assembler or linker. If you do not, the compiler will assume that you are not using the GNU assembler or linker. In this case the compiler will normally use the common subset of command line options supported by the native assembler and the GNU assembler. In general that only affects the compiler behaviour on platforms which support multiple assemblers and/or linkers. E.g., on GNU/Linux, we always assume the GNU assembler and linker. There is an exception. If you use --with-ld, the compiler will run the linker with the -v option and grep for GNU in the output. If it finds it, it will assume it is the GNU linker. The reason for this exception is that --with-ld gives a linker which will always be used. The assumption when no specific linker is specified is that you might wind up using any linker available on the system, depending on the value of PATH when running the compiler. Ian Is it reasonable to assume when the configure test reports using GNU linker, it has taken that "exception," even without a --with-ld specification? -- Tim Prince
Re: gcc command line exceeds 8191 when building in XP
On 7/19/2010 4:13 PM, IceColdBeer wrote: Hi, I'm building a project using GNU gcc, but the command line used to build each source file sometimes exceeds 8191 characters, which is the maximum supported command line length under Win XP. Even worse under Win 2000, where the maximum command line length is limited to 2047 characters. Can GNU gcc read the build options from a file instead? I have searched, but cannot find an option in the documentation. Thanks in advance, ICB Redirecting to gcc-help. The gcc builds for Windows themselves use a scheme for splitting the link into multiple steps in order to deal with command line length limits. I would suggest adapting that. Can't study it myself now while travelling. -- Tim Prince
Re: x86 assembler syntax
On 8/8/2010 10:21 PM, Rick C. Hodgin wrote: All, Is there an Intel-syntax compatible option for GCC or G++? And if not, why not? It's so much cleaner than AT&T's. - Rick C. Hodgin I don't know how you get along without a search engine. What about http://tldp.org/HOWTO/Assembly-HOWTO/gas.html ? -- Tim Prince
Re: food for optimizer developers
On 8/10/2010 9:21 PM, Ralf W. Grosse-Kunstleve wrote: Most of the time is spent in this function...

void dlasr( str_cref side, str_cref pivot, str_cref direct, int const& m, int const& n, arr_cref c, arr_cref s, arr_ref a, int const& lda)

in this loop:

FEM_DOSTEP(j, n - 1, 1, -1) {
  ctemp = c(j);
  stemp = s(j);
  if ((ctemp != one) || (stemp != zero)) {
    FEM_DO(i, 1, m) {
      temp = a(i, j + 1);
      a(i, j + 1) = ctemp * temp - stemp * a(i, j);
      a(i, j) = stemp * temp + ctemp * a(i, j);
    }
  }
}

a(i, j) is implemented as

T* elems_; // member
T const& operator()( ssize_t i1, ssize_t i2) const { return elems_[dims_.index_1d(i1, i2)]; }

with

ssize_t all[Ndims]; // member
ssize_t origin[Ndims]; // member
size_t index_1d( ssize_t i1, ssize_t i2) const { return (i2 - origin[1]) * all[0] + (i1 - origin[0]); }

The array pointer is buried as the elems_ member in the arr_ref<> class template. How can I apply __restrict in this case? Do you mean you are adding an additional level of functions and hoping for efficient in-lining? Your programming style is elusive, and your insistence on top posting will make this thread difficult to deal with. The conditional inside the loop is likely even more difficult for C++ to optimize than Fortran. As already discussed, if you don't optimize otherwise, you will need __restrict to overcome aliasing concerns among a, c, and s. If you want efficient C++, you will need a lot of hand optimization, and verification of the effect of each level of obscurity which you add. How is this topic appropriate to the gcc mailing list? -- Tim Prince
Re: End of GCC 4.6 Stage 1: October 27, 2010
On 9/6/2010 9:21 AM, Richard Guenther wrote: On Mon, Sep 6, 2010 at 6:19 PM, NightStrike wrote: On Mon, Sep 6, 2010 at 5:21 AM, Richard Guenther wrote: On Mon, 6 Sep 2010, Tobias Burnus wrote: Gerald Pfeifer wrote: Do you have a pointer to testresults you'd like us to use for reference? From our release criteria, for secondary platforms we have: • The compiler bootstraps successfully, and the C++ runtime library builds. • The DejaGNU testsuite has been run, and a substantial majority of the tests pass. See for instance: http://gcc.gnu.org/ml/gcc-testresults/2010-09/msg00295.html There are no libstdc++ results in that. Richard. This is true. I always run make check-gcc. What should I be doing instead? make -k check make check-c++ runs both g++ and libstdc++-v3 testsuites. -- Tim Prince
Re: Turn on -funroll-loops at -O3?
On 1/21/2011 10:43 AM, H.J. Lu wrote: Hi, Since -O3 turns on the vectorizer, should it also turn on -funroll-loops? Only if a conservative default value for max-unroll-times is set, e.g. 2 <= value <= 4. -- Tim Prince
Re: Why doesn't the vectorizer skip loop peeling/versioning for targets supporting hardware misaligned access?
On 1/24/2011 5:21 AM, Bingfeng Mei wrote: Hello, Some of our target processors support complete hardware misaligned memory access. I implemented movmisalignm patterns, and found TARGET_SUPPORT_VECTOR_MISALIGNMENT (TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT On 4.6) hook is based on checking these patterns. Somehow this hook doesn't seem to be used. vect_enhance_data_refs_alignment is called regardless whether the target has HW misaligned support or not. Shouldn't using HW misaligned memory access be better than generating extra code for loop peeling/versioning? Or at least if for some architectures it is not the case, we should have a compiler hook to choose between them. BTW, I mainly work on 4.5, maybe 4.6 has changed. Thanks, Bingfeng Mei Peeling for alignment still presents a performance advantage on longer loops for the most common current CPUs. Skipping the peeling is likely to be advantageous for short loops. I've noticed that 4.6 can vectorize loops with multiple assignments, presumably taking advantage of misalignment support. There's even a better performing choice of instructions for -march=corei7 misaligned access than is taken by other compilers, but that could be an accident. At this point, I'd like to congratulate the developers for the progress already evident in 4.6. -- Tim Prince
supporting finer grained -Wextra
hi there. i just enabled -Wextra to catch broken if statements, i.e. to enable warnings on: * An empty body occurs in an if or else statement. however this unfortunately triggers other warnings that i can't reasonably get rid of. here's a test snippet:

== test.c ==
typedef enum {
  VAL0 = 0,
  VAL1 = 1,
  VAL2 = 2
} PositiveEnum;

int main (int argc, char *argv[])
{
  PositiveEnum ev = VAL1;
  unsigned int limit = VAL2;
  if (ev >= 0) {}
  if (ev <= limit) {}
  if (1)
    ;
  return 0;
}
== test.c ==

compiled as C code, this will produce (with a 4.2 snapshot):
$ gcc -Wall -Wextra -Wno-unused -x c test.c
test.c: In function 'main':
test.c:6: warning: comparison of unsigned expression >= 0 is always true
test.c:11: warning: empty body in an if-statement
and compiled as C++ code:
$ gcc -Wall -Wextra -Wno-unused -x c++ test.c
test.c: In function 'int main(int, char**)':
test.c:8: warning: comparison between signed and unsigned integer expressions
test.c:11: warning: empty body in an if-statement

that means, for a header file that is used for C and C++ code, i simply can't avoid one of those enum signedness warnings, simply because the enum is treated as unsigned in C and as signed in C++. now, apparently the enum related signedness warnings are unconditionally enabled by -Wextra, as are the if-related warnings, from the man-page: -Wextra [...] Print extra warning messages for these events: * An unsigned value is compared against zero with < or >=. * An empty body occurs in an if or else statement. since the enum related signedness warnings are clearly bogus [1], i'd like to request new warning options that allow enabling/disabling of the empty-body vs. unsigned-zero-cmp warnings independently. that way i can preserve "empty body in an if-statement" while getting rid of the useless enum comparison warnings. [1] i'm aware that the enum signedness comparison warnings could be worked around by explicit casts or by adding a negative enum value.
this would just create new problems though, because aside from worsening readability, casts tend to "blind" the compiler with regards to whole classes of other bugs, and adding a dummy enum value would affect API and auto-generated documentation. --- ciao TJ
Re: Problem with type safety and the "sentinel" attribute
thanks for the quick response Kaveh. On Fri, 9 Jun 2006, Kaveh R. Ghazi wrote: > void print_string_array (const char *array_name, > const char *string, ...) __attribute__ > ((__sentinel__)); > > print_string_array ("empty_array", NULL); /* gcc warns, but shouldn't */ > > The only way out for keeping the sentinel attribute and avoiding the > warning is using > > static void print_string_array (const char *array_name, ...) > __attribute__ ((__sentinel__)); I think you could maintain typesafety and silence the warning by keeping the more specific prototype and adding an extra NULL, e.g.: print_string_array ("empty_array", NULL, NULL); Doesn't seem elegant, but it does the job. this is an option for a limited set of callers, yes. > By the way, there is already an existing gcc bug, which is about the > same thing (NULL passed within named args), but wants to have it the > way it works now: > > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21911 Correct, the feature as I envisioned it expected the sentinel to appear in the variable arguments only. This PR reflects when I found out it didn't do that and got around to fixing it. Note the "buggy" behavior wasn't exactly what you wanted either because GCC got fooled by a sentinel in *any* of the named arguments, not just the last one. > so if it gets changed, then gcc might need to support both > - NULL termination within the last named parameter allowed > - NULL termination only allowed within varargs parameters (like it is >now) I'm not against this enhancement, but you need to specify a syntax that allows the old behavior but also allows doing it your way. Hmm, perhaps we could check for attribute "nonnull" on the last named argument, if it exists then that can't be the sentinel, if it's not there then it does what you want. This is not completely backwards compatible because anyone wanting the existing behavior has to add the attribute nonnull. 
But there's precedent for this when attribute printf split out whether the format specifier could be null... We could also create a new attribute name for the new behavior. This would preserve backwards compatibility. I like this idea better. i agree here. as far as the majority of the GLib and Gtk+ APIs are concerned, we don't really need the flexibility of the sentinel attribute but rather a compiler check on whether the last argument used in a function call is NULL or 0 (regardless of whether it's the last named arg or already part of the varargs list). that's also why the actual sentinel wrapper in GLib looks like this: #define G_GNUC_NULL_TERMINATED __attribute__((__sentinel__)) so, if i was to make a call on this issue, i'd either introduce __attribute__((__null_terminated__)) with the described semantics, or have __attribute__((__sentinel__(-1))) work essentially like __attribute__((__sentinel__(0))) while also accepting 0 in the position of the last named argument. Next you need to recruit someone to implement this enhancement, or submit a patch. :-) Although given that you can silence the warning by adding an extra NULL at the call site, I'm not sure it's worth it. i would say this is definitely worth it, because the issue also shows up in other code that is widely used: gpointer g_object_new (GType object_type, const gchar *first_property_name, ...); that's for instance a function which is called in many projects. putting the burden on the caller is clearly the wrong trade-off here. so please take this as a vote for the worthiness of a fix ;) --Kaveh -- Kaveh R. Ghazi [EMAIL PROTECTED] --- ciao TJ
Re: Are 8-byte ints guaranteed?
Thomas Koenig wrote: Hello world, are there any platforms where gcc doesn't support 8-byte ints? Can a front end depend on this? This would make life easier for Fortran, for example, because we could use INTEGER(KIND=8) for a lot of interfaces without having to bother with checks for the presence of KIND=8 integers. No doubt there are such platforms, although I doubt there is sufficient interest in running gfortran on them. Support for 64-bit integers on common 32-bit platforms is rather inefficient when it is done with pairs of 32-bit integers.
Re: g77 problem for octave
[EMAIL PROTECTED] wrote: Dear Sir/Madame, I have switched my OS to SuSE Linux 10.1 and for a while trying to install "Octave" to my computer. Unfortunately, the error message below is the only thing that i got. Installing octave-2.1.64-3.i586[Local packages] There are no installable providers of gcc-g77 for octave-2.1.64-3.i586[Local packages] On my computer, the installed version of gcc is 4.1.0-25 and i could not find any compatible version of g77 to install. For the installation of octave, i need exactly gcc-g77 not gcc-fortran. Can you please help me to deal with this problem? If you are so interested in using g77 rather than gfortran, it should be easy enough to grab gcc-3.4.x sources and build g77. One would wonder why you dislike gfortran so much.
libgomp: Thread creation failed: Invalid argument
I am very happy to see that gfortran from current gcc snapshots can successfully compile an 18000 lines Fortran 77 numerics program I wrote. Results are indeed the same as obtained with other compilers (g77, PGI, ifort), and also execution speed seems roughly comparable, although I haven't yet done any precise measurements. A big thank you to the developers for that! Now I am trying to get the program to run with OpenMP, which works (although slower than anticipated) with PGI and ifort compilers. While I can successfully build and execute small OpenMP test programs, starting my large program fails with the message libgomp: Thread creation failed: Invalid argument resulting from a failing call to pthread_create() in libgomp/team.c. Using gdb I see that pthread_create() is called with the same gomp_thread_attr argument as for the smaller, succeeding testcases. strace shows that pthread_create() fails without trying to call clone(), while the clone() call of course does happen for the succeeding testcases. How to further debug this problem? I am currently using gcc-4.2-20060812 on i686 and x86_64 SuSE 10.0 Linux systems. Thank you, Tim