Re: Question about code licensing
> I think the main reason is that DMD front end sources are dual licensed with GPL and Artistic License. The DMD backend is not under an open source license (personal use only), so the Artistic License is how the two are integrated. The fork is required to allow DMD to continue under its current license scheme.
>
> It also means that fixes to the GCC front end would not be copyable to the DMD front end going forward.

Strictly speaking, that's not true. Even if the submitter would still be required to have copyright assignment for the FSF, they could be copyable to the DMD front-end _as long as the submitter himself sends them for inclusion there too_. This is the practical significance of the license grantback from the FSF to the author.

I'm not sure whether it suffices to otherwise specify the intention to release the changes under the dual license in the message, and I don't want to imply in any way that this is possible since IANAL.

That said, 1) I don't think the FSF would be very happy; 2) the question still stands of whether/how to assign copyright to the FSF for changes before the inclusion in the gcc.gnu.org repository.

A related topic is this: when is the copyright assigned to the FSF for a particular patch---for example, when the patch is posted or when it is committed?(*) In other words, do I have to ask the poster for permission if I want to get into GCC a patch that was sent to the mailing list but never committed?

(*) And how does this change when the submitter doesn't have write access to the repository?

Paolo
Re: int vs. bool / _Bool (Was: Re: Committed: Fix distribute_loop)
On 01/23/2010 04:29 PM, Richard Guenther wrote:
> We could warn about this when building with C++ but with C we do not see bools but ints here.

With such a warning there would be no reason not to build stage2 and stage3 with bool == _Bool.

Paolo
Re: Question about code licensing
On Sun, Jan 24, 2010 at 07:00:44AM -0800, Paolo Bonzini wrote:
> > I think the main reason is that DMD front end sources are dual licensed
> > with GPL and Artistic License. The DMD backend is not under an open
> > source license (personal use only), so the Artistic License is how the
> > two are integrated. The fork is required to allow DMD to continue under
> > its current license scheme.
> >
> > It also means that fixes to the GCC front end would not be copyable to
> > the DMD front end going forward.
>
> Strictly speaking, that's not true. Even if the submitter would still
> be required to have copyright assignment for the FSF, they could be
> copyable to the DMD front-end _as long as the submitter himself sends
> them for inclusion there too_. This is the practical significance of
> the license grantback from the FSF to the author.

This is getting off-topic for this list. Still, if this were the plan (and I don't know whether it is or not), I think that the FSF would reject it, because it would implicitly ask all GCC developers to help out with a proprietary product.

There would also be a huge conflict-of-interest issue if the official maintainer of the D front end were in a position to accept or reject patches based not on their technical merit, but on whether the contributor agrees to separately contribute them under the dual-license scheme, and his/her employer had an interest in this issue.
Re: Question about code licensing
>> Strictly speaking, that's not true. Even if the submitter would still be required to have copyright assignment for the FSF, they could be copyable to the DMD front-end _as long as the submitter himself sends them for inclusion there too_. This is the practical significance of the license grantback from the FSF to the author.
>
> This is getting off-topic for this list. Still, if this were the plan (and I don't know whether it is or not), I think that the FSF would reject it, because it would implicitly ask all GCC developers to help out with a proprietary product.

Yes, this is what I meant by the FSF not liking it.

> There would also be a huge conflict-of-interest issue if the official maintainer of the D front end were in a position to accept or reject patches based not on their technical merit, but on whether the contributor agrees to separately contribute them under the dual-license scheme, and his/her employer had an interest in this issue.

This only makes it worse.

Paolo
Re: speed of double-precision divide
Richard,

Could you provide us with a good reference for the latencies and other speed issues of SSE operations? What I've found is scattered and hard to compare.

Frankly, I was under the misconception that each of these SSE operations was meant to be accomplished in a single clock cycle (although I knew there are various other issues).

Cheers!

On 23.01.10, Richard Guenther wrote:
> On Sat, Jan 23, 2010 at 6:33 PM, Steve White wrote:
> > Hi, Andrew!
> > ...
> >
> > Never mind icc for the moment, with whatever trick it may be doing.
> > Why is the SSE2 division so slow, compared to multiplication?
> >
> > Change one character in the division test to make a multiplication test.
> > It is an order of magnitude difference in speed.
>
> It's because multiplication latency is like 4 cycles while division is about
> 20, also one multiplication can be issued per cycle while only every
> 17th instruction can be a division (AMD Fam10 values).
>
> GCC performs loop interchange with -ftree-loop-linear but the pass
> is scheduled in an unfortunate place so no further optimization happens.
>
> Richard.
>

--
| - - - - - - - - - - - - - - - - - - - - - - - - -
| Steve White                        +49(331)7499-202
| e-Science / AstroGrid-D            Zi. 35 Bg. 20
| - - - - - - - - - - - - - - - - - - - - - - - - -
| Astrophysikalisches Institut Potsdam (AIP)
| An der Sternwarte 16, D-14482 Potsdam
|
| Vorstand: Prof. Dr. Matthias Steinmetz, Peter A. Stolz
|
| Stiftung privaten Rechts, Stiftungsverzeichnis Brandenburg: III/7-71-026
| - - - - - - - - - - - - - - - - - - - - - - - - -
Re: speed of double-precision divide
On Sun, Jan 24, 2010 at 10:32 PM, Steve White wrote:
> Richard,
>
> Could you provide us with a good reference for the latencies and other
> speed issues of SSE operations? What I've found is scattered and hard
> to compare.
>
> Frankly, I was under the misconception that each of these SSE operations
> was meant to be accomplished in a single clock cycle (although I knew there
> are various other issues.)

Both Intel and AMD list them in their optimization and/or instruction reference guides. I suppose Wikipedia might even link to the relevant PDFs (though Google should also find them).

Richard.
Re: speed of double-precision divide
Steve White wrote:
> I was under the misconception that each of these SSE operations was meant to be accomplished in a single clock cycle (although I knew there are various other issues.)

Current CPU architectures permit an SSE scalar or parallel multiply and add instruction to be issued on each clock cycle. Completion takes at least 4 cycles for add, significantly more for multiply. The instruction timing tables quote throughput (the number of cycles between successive issues) and latency (the number of cycles to complete an individual operation).

An even more common misconception than yours is that the extra time taken to complete multiply, compared with the time of add, would disappear with fused multiply-add instructions.

SSE divide, as has been explained, is not pipelined. The best way to speed up a loop with divide is with vectorization, barring situations such as the one you brought up where divide may not actually be a necessary part of the algorithm.
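To make the throughput/latency distinction concrete, here is a rough sketch in Python. It is a toy model, not a measurement: it assumes a single execution unit, ignores everything else in the pipeline, and plugs in the approximate AMD Fam10 figures quoted earlier in the thread (multiply: latency about 4 cycles, one issue per cycle; divide: latency about 20 cycles, one issue every 17 cycles). The function names are made up for this example.

```python
def cycles_for_independent_ops(n_ops, latency, issue_interval):
    """Toy estimate for n_ops operations that do not depend on each other.

    A new operation can start every `issue_interval` cycles, and the last
    one still needs `latency` cycles to complete.
    """
    if n_ops <= 0:
        return 0
    return latency + (n_ops - 1) * issue_interval


def cycles_for_dependent_chain(n_ops, latency):
    """Toy estimate when each operation needs the previous result.

    Nothing can overlap, so every operation pays its full latency.
    """
    return max(n_ops, 0) * latency


# Assumed figures, taken from the numbers quoted in this thread.
MUL_LATENCY, MUL_ISSUE = 4, 1
DIV_LATENCY, DIV_ISSUE = 20, 17

n = 1000
mul_indep = cycles_for_independent_ops(n, MUL_LATENCY, MUL_ISSUE)  # 1003
div_indep = cycles_for_independent_ops(n, DIV_LATENCY, DIV_ISSUE)  # 17003
mul_chain = cycles_for_dependent_chain(n, MUL_LATENCY)             # 4000
div_chain = cycles_for_dependent_chain(n, DIV_LATENCY)             # 20000

print("independent ops: mul", mul_indep, "div", div_indep)
print("dependent chain: mul", mul_chain, "div", div_chain)
```

Under these assumed numbers, a stream of independent divides comes out roughly 17 times slower than the same stream of multiplies, which is in line with the order-of-magnitude gap reported in the test. In a fully dependent chain the gap shrinks to the latency ratio, about 5.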
gcc-4.3-20100124 is now available
Snapshot gcc-4.3-20100124 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.3-20100124/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.3 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_3-branch revision 156198

You'll find:

gcc-4.3-20100124.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.3-20100124.tar.bz2       C front end and core compiler
gcc-ada-4.3-20100124.tar.bz2        Ada front end and runtime
gcc-fortran-4.3-20100124.tar.bz2    Fortran front end and runtime
gcc-g++-4.3-20100124.tar.bz2        C++ front end and runtime
gcc-java-4.3-20100124.tar.bz2       Java front end and runtime
gcc-objc-4.3-20100124.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.3-20100124.tar.bz2  The GCC testsuite

Diffs from 4.3-20100117 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.3 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Successful make profiledbootstrap of GCC 4.4.3 and GCC 4.5.0 (SVN revision 156177) on Snow Leopard 10.6.2 x86_64-apple-darwin10.2.0
This is to report a successful build of GCC 4.4.3 and GCC 4.5.0 (156177) on my Macbook6,1.

./config.guess:
i386-apple-darwin10.2.0

gcc -v:

For GCC 4.4.3:

Using built-in specs.
Target: x86_64-apple-darwin10.2.0
Configured with: ../gcc-4.4.3/configure --prefix=/Users/olexa/Documents/MyApps/Commands/GCC/MacOSX-stable --with-pkgversion='GNU GCC 4.4.3 Codename Hallelujah GCC built Jan 21, 2009 with GMP 4.3.2, MPFR 2.4.2, MPC 0.8.1 and Libelf 0.8.13, bootstrap by GNU GCC 4.4.3 Codename Bootstrapper GCC' --with-libelf=/Users/olexa/Documents/MyApps/Commands/GCC/Dependencies/libelf --enable-lto --with-mpc=/Users/olexa/Documents/MyApps/Commands/GCC/Dependencies/mpc --with-mpfr=/Users/olexa/Documents/MyApps/Commands/GCC/Dependencies/mpfr --with-gmp=/Users/olexa/Documents/MyApps/Commands/GCC/Dependencies/gmp --enable-shared --enable-static --target=x86_64-apple-darwin10.2.0 --build=x86_64-apple-darwin10.2.0 --host=x86_64-apple-darwin10.2.0 --enable-threads --enable-languages=c,c++,fortran,objc,obj-c++ --enable-werror --enable-checking --enable-stage1-checking --disable-nls --disable-build-with-cxx --enable-gather-detailed-mem-stats --enable-decimal-float --with-tune=core2 CC='gcc -O3' CPPFLAGS=-O3
Thread model: posix
gcc version 4.4.3 (GNU GCC 4.4.3 Codename Hallelujah GCC built Jan 21, 2009 with GMP 4.3.2, MPFR 2.4.2, MPC 0.8.1 and Libelf 0.8.13, bootstrap by GNU GCC 4.4.3 Codename Bootstrapper GCC)

For GCC 4.5.0 revision 156177:

Using built-in specs.
COLLECT_GCC=./gcc
COLLECT_LTO_WRAPPER=/Users/olexa/Documents/MyApps/Commands/GCC/MacOSX-instable/libexec/gcc/x86_64-apple-darwin10.2.0/4.5.0/lto-wrapper
Target: x86_64-apple-darwin10.2.0
Configured with: ../gcc-svn-156177/configure --prefix=/Users/olexa/Documents/MyApps/Commands/GCC/MacOSX-instable --with-pkgversion='GNU GCC (4.5.0 - SVN revision 156177) Codename Mjolnir built Jan 22, 2009 with GMP 4.3.2, MPFR 2.4.2, MPC 0.8.1 and Libelf 0.8.13, bootstrap by GNU GCC 4.4.3 Codename Hallelujah GCC' --with-libelf=/Users/olexa/Documents/MyApps/Commands/GCC/Dependencies/libelf --disable-lto --with-mpc=/Users/olexa/Documents/MyApps/Commands/GCC/Dependencies/mpc --with-mpfr=/Users/olexa/Documents/MyApps/Commands/GCC/Dependencies/mpfr --with-gmp=/Users/olexa/Documents/MyApps/Commands/GCC/Dependencies/gmp --enable-shared --enable-static --target=x86_64-apple-darwin10.2.0 --build=x86_64-apple-darwin10.2.0 --host=x86_64-apple-darwin10.2.0 --enable-threads --enable-languages=c,c++,fortran,objc,obj-c++ --enable-werror --enable-checking --enable-stage1-checking --disable-nls --disable-build-with-cxx --enable-gather-detailed-mem-stats --enable-decimal-float --with-tune=core2
Thread model: posix
gcc version 4.5.0 20100122 (experimental) (GNU GCC (4.5.0 - SVN revision 156177) Codename Mjolnir built Jan 22, 2009 with GMP 4.3.2, MPFR 2.4.2, MPC 0.8.1 and Libelf 0.8.13, bootstrap by GNU GCC 4.4.3 Codename Hallelujah GCC)

Other relevant information:

uname -a:
Darwin *-**-MacBook.local 10.2.0 Darwin Kernel Version 10.2.0: Tue Nov 3 10:37:10 PST 2009; root:xnu-1486.2.11~1/RELEASE_I386 i386

System Specs:
MacBook6,1 (Late 2009) Standard Configuration
Mac OS X 10.6.2
Intel Core 2 Duo 2.26 GHz
2GB DDR3 1066 MHz RAM
Xcode 3.2.1 (GCC 4.2.1 (1), Apple Inc.)

Request for update to system-specific installation instructions:

I am on Mac OS X Snow Leopard. There has been some noise around the forums that GCC fails for various reasons.
It turns out that despite having all the requirements to run a 64-bit system, including a 64-bit processor (an Intel Core 2 Duo), no Macs boot the 64-bit kernel by default, and only four lines (Mac Pros, Xserves, MacBook Pros and iMacs) are allowed to boot it at all. The result is that the system kernel runs 32-bit while almost every application is 64-bit. Also, uname returns i386 even though the Core 2 Duo is a 64-bit processor, better described as i686 or x86_64. It seems therefore that config.guess gets the bitness wrong, picking the 32-bit flavour called i386 rather than the correct choice, x86_64.

So the recommendation is to add an entry under Build Stats saying that 4.4.3 builds successfully on x86_64-apple-darwin10.2.0, and to add to the system-specific installation notes that these options should be passed to configure:

  --build=x86_64-apple-darwin10 --host=x86_64-apple-darwin10 --target=x86_64-apple-darwin10

This essentially forces configure to choose the 64-bit flavour.

I selected --with-tune=core2 because I am not compiling executables for any other processors - Macs have all switched to either Core 2 Duo, Core 2 Quad, or, in high-ends, to i5 and i7 with Intel Xeon or Nehalem architecture.

I experienced problems with --enable-lto, bizarrely, so I had to turn it off with --disable-lto. This goes for both 4.4.3 and 4.5.0.