Re: Inefficient loop unrolling.
Bingfeng Mei wrote:
> Steven,
> I just created a bug report. You should receive a CCed mail now.
>
> I can see these issues are solvable at RTL-level, but require lots of
> efforts. The main optimization in loop unrolling pass, split iv, can
> reduce dependence chain but not extra ADDs and alias issue. What is the
> main reason that loop unrolling should belong to RTL level? Is it
> fundamental?

No, it is just effectiveness of the code size expansion heuristics. Ivopts is already complex enough on the tree level, that doing it on RTL would be insane. But other low-level loop optimizations had already been written on the RTL level and since there were no compelling reasons, they were left there.

That said, this is a bug -- fwprop should have folded the ADDs, at the very least. I'll look at the PR.

Paolo
RE: Inefficient loop unrolling.
Paolo,

Thanks for the reply. However, I am not sure it is a simple folding issue. For example,

    B1 = B + 4;
       = [A, B1]
    B2 = B + 8;
       = [A, B2]
    B3 = B + 12;
       = [A, B3]

should be transformed to

    C = A + B
      = [C, 4]
      = [C, 8]
      = [C, 12]

The loop exit condition needs to be changed accordingly.

BTW, I just added an experimental tree-level loop unrolling pass in my port, right before the ivopts pass. The results are very promising except for a few quirks, which I believe to be problems in ivopts. The produced assembly code is now as good as manual unrolling.

Cheers,
Bingfeng

-----Original Message-----
From: Paolo Bonzini [mailto:[EMAIL PROTECTED] On Behalf Of Paolo Bonzini
Sent: 10 July 2008 13:34
To: Bingfeng Mei
Cc: Steven Bosscher; gcc@gcc.gnu.org
Subject: Re: Inefficient loop unrolling.

Bingfeng Mei wrote:
> Steven,
> I just created a bug report. You should receive a CCed mail now.
>
> I can see these issues are solvable at RTL-level, but require lots of
> efforts. The main optimization in loop unrolling pass, split iv, can
> reduce dependence chain but not extra ADDs and alias issue. What is the
> main reason that loop unrolling should belong to RTL level? Is it
> fundamental?

No, it is just effectiveness of the code size expansion heuristics. Ivopts is already complex enough on the tree level, that doing it on RTL would be insane. But other low-level loop optimizations had already been written on the RTL level and since there were no compelling reasons, they were left there.

That said, this is a bug -- fwprop should have folded the ADDs, at the very least. I'll look at the PR.

Paolo
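The address transformation described in the message above can be sketched at the source level in C. This is only an illustration of the two shapes, not GCC's RTL or the output of any pass; the function names are hypothetical, and 4-byte elements are assumed so that the byte offsets 4, 8 and 12 correspond to element offsets 1, 2 and 3. A compiler may well generate identical code for both forms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Shape 1: every unrolled copy computes its own index (B1, B2, B3)
   and then addresses memory as [A, Bn]. */
int32_t sum_separate_adds(const int32_t *a, size_t b)
{
    size_t b1 = b + 1;   /* B1 = B + 4 bytes  */
    size_t b2 = b + 2;   /* B2 = B + 8 bytes  */
    size_t b3 = b + 3;   /* B3 = B + 12 bytes */
    return a[b1] + a[b2] + a[b3];
}

/* Shape 2: one shared base C = A + B, then constant displacements
   [C, 4], [C, 8], [C, 12]. */
int32_t sum_shared_base(const int32_t *a, size_t b)
{
    const int32_t *c = a + b;
    return c[1] + c[2] + c[3];
}

/* Self-check: both shapes read the same three elements. */
void check_equivalence(void)
{
    int32_t data[8] = {10, 20, 30, 40, 50, 60, 70, 80};
    for (size_t b = 0; b + 3 < 8; ++b)
        assert(sum_separate_adds(data, b) == sum_shared_base(data, b));
}
```

In the second shape only the base pointer has to advance per iteration, which is why the loop exit condition would need to be expressed in terms of that base.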
[lto] Bootstrap failure
Is this the bootstrap failure that you folks were discussing in another thread? Is anyone fixing this?

/home/dnovillo/perf/sbox/lto/local.x86_64/src/gcc/lto-function-in.c:1984: error: request for implicit conversion from 'void *' to 'union tree_node **' not permitted in C++
make[3]: *** [lto-function-in.o] Error 1
make[3]: *** Waiting for unfinished jobs
rm gcj-dbtool.pod fsf-funding.pod jcf-dump.pod jv-convert.pod grmic.pod gcov.pod gcj.pod gc-analyze.pod gfdl.pod cpp.pod gij.pod gcc.pod gfortran.pod
make[3]: Leaving directory `/usr/local/google/dnovillo/perf/sbox/lto/local.x86_64/bld/gcc'
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory `/usr/local/google/dnovillo/perf/sbox/lto/local.x86_64/bld'
make[1]: *** [stage2-bubble] Error 2
make[1]: Leaving directory `/usr/local/google/dnovillo/perf/sbox/lto/local.x86_64/bld'
make: *** [bootstrap] Error 2

Thanks. Diego.
Re: [lto] Bootstrap failure
2008/7/10 Diego Novillo <[EMAIL PROTECTED]>:
> Is this the bootstrap failure that you folks were discussing in
> another thread? Is anyone fixing this?

I just committed Bill's patch with some small modifications. Should be bootstrapping now.

>
> /home/dnovillo/perf/sbox/lto/local.x86_64/src/gcc/lto-function-in.c:1984:
> error: request for implicit conversion from 'void *' to 'union
> tree_node **' not permitted in C++
> make[3]: *** [lto-function-in.o] Error 1
> make[3]: *** Waiting for unfinished jobs
> rm gcj-dbtool.pod fsf-funding.pod jcf-dump.pod jv-convert.pod
> grmic.pod gcov.pod gcj.pod gc-analyze.pod gfdl.pod cpp.pod gij.pod
> gcc.pod gfortran.pod
> make[3]: Leaving directory
> `/usr/local/google/dnovillo/perf/sbox/lto/local.x86_64/bld/gcc'
> make[2]: *** [all-stage2-gcc] Error 2
> make[2]: Leaving directory
> `/usr/local/google/dnovillo/perf/sbox/lto/local.x86_64/bld'
> make[1]: *** [stage2-bubble] Error 2
> make[1]: Leaving directory
> `/usr/local/google/dnovillo/perf/sbox/lto/local.x86_64/bld'
> make: *** [bootstrap] Error 2
>
> Thanks. Diego.
>

Cheers,
--
Rafael Avila de Espindola

Google Ireland Ltd.
Gordon House
Barrow Street
Dublin 4
Ireland

Registered in Dublin, Ireland
Registration Number: 368047
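For readers unfamiliar with the diagnostic quoted above: C converts 'void *' to any object pointer type implicitly, while C++ does not. The following is a minimal sketch of that class of problem and one common way around it, an explicit cast. It does not reproduce the committed patch or the code in lto-function-in.c; `struct node` and `alloc_node_table` are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the compiler's own node type. */
struct node {
    int value;
};

/* calloc returns 'void *'.  Plain C would accept
       struct node **table = calloc(...);
   but the same line is rejected by a C++ compiler.  The explicit
   cast below is accepted by both languages. */
struct node **alloc_node_table(size_t count)
{
    return (struct node **) calloc(count, sizeof(struct node *));
}

/* Self-check: the table is usable after allocation. */
void check_alloc(void)
{
    struct node n = { 42 };
    struct node **table = alloc_node_table(4);
    assert(table != NULL);
    table[0] = &n;
    assert(table[0]->value == 42);
    free(table);
}
```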
The Linux binutils 2.18.50.0.8 is released
This is the beta release of binutils 2.18.50.0.8 for Linux, which is based on binutils 2008 0709 in CVS on sourceware.org plus various changes. It is purely for Linux.

All relevant patches in patches have been applied to the source tree. You can take a look at patches/README to see what has been applied and in what order.

Starting from the 2.18.50.0.4 release, the x86 assembler no longer accepts

    fnstsw %eax

fnstsw stores 16 bits into %ax and the upper 16 bits of %eax are unchanged. Please use

    fnstsw %ax

Starting from the 2.17.50.0.4 release, the default output section LMA (load memory address) has changed for allocatable sections from being equal to VMA (virtual memory address), to keeping the difference between LMA and VMA the same as the previous output section in the same region.

For

    .data.init_task : { *(.data.init_task) }

LMA of .data.init_task section is equal to its VMA with the old linker. With the new linker, it depends on the previous output section. You can use

    .data.init_task : AT (ADDR(.data.init_task)) { *(.data.init_task) }

to ensure that LMA of .data.init_task section is always equal to its VMA. The linker script in the older 2.6 x86-64 kernel depends on the old behavior. You can add AT (ADDR(section)) to force LMA of .data.init_task section equal to its VMA. It will work with both old and new linkers. The x86-64 kernel linker script in kernel 2.6.13 and above is OK.

The new x86_64 assembler no longer accepts

    monitor %eax,%ecx,%edx

You should use

    monitor %rax,%ecx,%edx

or

    monitor

which works with both old and new x86_64 assemblers. They should generate the same opcode.

The new i386/x86_64 assemblers no longer accept instructions for moving between a segment register and a 32bit memory location, i.e.,

    movl (%eax),%ds
    movl %ds,(%eax)

To generate instructions for moving between a segment register and a 16bit memory location without the 16bit operand size prefix, 0x66,

    mov (%eax),%ds
    mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The assembler starting from 2.16.90.0.1 will also support

    movw (%eax),%ds
    movw %ds,(%eax)

without the 0x66 prefix. Patches for 2.4 and 2.6 Linux kernels are available at

http://www.kernel.org/pub/linux/devel/binutils/linux-2.4-seg-4.patch
http://www.kernel.org/pub/linux/devel/binutils/linux-2.6-seg-5.patch

The ia64 assembler now defaults to tuning for Itanium 2 processors. To build a kernel for Itanium 1 processors, you will need to add

    ifeq ($(CONFIG_ITANIUM),y)
    CFLAGS += -Wa,-mtune=itanium1
    AFLAGS += -Wa,-mtune=itanium1
    endif

to arch/ia64/Makefile in your kernel source tree.

Please report any bugs related to binutils 2.18.50.0.8 to [EMAIL PROTECTED] and http://www.sourceware.org/bugzilla/

Changes from binutils 2.18.50.0.7:

1. Update from binutils 2008 0709.
2. Allow vmovd with 64bit operand in x86 assembler.
3. Improve -msse-check in x86 assembler.
4. Fix an AVX assembler bug in Intel syntax. PR 6517.
5. Improve error message in Intel syntax for x86 assembler. PR 6518.
6. Add the ".sse_check" directive to x86 assembler.
7. Improve gold.
8. Improve objcopy/strip. PR 2995/6473.
9. Improve objdump -g. PR 6483.
10. Improve ld --sort-common. PR 6430.
11. Add multi-GOT support for m68k.
12. Fix various arm bugs.
13. Fix various avr bugs.
14. Fix various hppa bugs.
15. Fix various m68k bugs.
16. Fix various mips bugs.
17. Fix various mmix bugs.
18. Fix various ppc bugs.
19. Fix various spu bugs.
20. Fix various xtensa bugs.

Changes from binutils 2.18.50.0.6:

1. Update from binutils 2008 0502.
2. Add Intel EPT and MOVBE support.
3. Correct Intel FMA operand order.
4. Change Intel CLMUL to Intel PCLMUL.
5. Add -msse-check to x86 assembler to warn about SSE instructions where there is an AVX equivalent.
6. Provide backward compatibility for ELF object files with more than 64K sections generated by the older binutils. PR 6412.
7. Improve FDPIC support.
8. Add -wL switch to readelf to dump decoded contents of .debug_line.
9. Add -ag switch to assembler to show general information in listings.
10. Improve objcopy symbol filtering performance. PR 6034.
12. Correct thin archive support.
13. Improve ELF/Sparc support.
14. Fix various mips bugs.
15. Fix various sh bugs.
16. Fix various spu bugs.

Changes from binutils 2.18.50.0.5:

1. Update from binutils 2008 0403.
2. Add Intel AES, CLMUL, AVX/FMA support.
3. Improve error handling in x86 linker for undefined hidden/internal symbols when building a shared object. PR ld/5789/5943.
4. Add a new ELF linker, gold.
5. Add thin archive support.
6. Fix various arm bugs.
7. Fix various avr bugs.
8. Fix various bfin bugs.
9. Fix various hppa bugs.
10. Fix various m68k bugs.
11. Fix various mips bugs.
12. Fix various s390 bugs.
13. Fix various spu bugs.

Changes from binutils 2.18.50.0.4:

1. Update from binutils 2008 0314.
2. Add Intel XSAVE new instructio
gcc-4.3-20080710 is now available
Snapshot gcc-4.3-20080710 is now available on

  ftp://gcc.gnu.org/pub/gcc/snapshots/4.3-20080710/

and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.3 SVN branch with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_3-branch revision 137704

You'll find:

gcc-4.3-20080710.tar.bz2            Complete GCC (includes all of below)
gcc-core-4.3-20080710.tar.bz2       C front end and core compiler
gcc-ada-4.3-20080710.tar.bz2        Ada front end and runtime
gcc-fortran-4.3-20080710.tar.bz2    Fortran front end and runtime
gcc-g++-4.3-20080710.tar.bz2        C++ front end and runtime
gcc-java-4.3-20080710.tar.bz2       Java front end and runtime
gcc-objc-4.3-20080710.tar.bz2       Objective-C front end and runtime
gcc-testsuite-4.3-20080710.tar.bz2  The GCC testsuite

Diffs from 4.3-20080703 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.3 link is updated and a message is sent to the gcc list. Please do not use a snapshot before it has been announced that way.
Re: Failure building GFortran (Cygwin)
According to Jerry DeLisle on 6/29/2008 11:45 AM:
| Ian Lance Taylor wrote:
|> CC'ed to Eric. This may require some configury patches somewhere.
|>
| Adjust strsignal to POSIX 200x prototype.
| * strsignal.c (strsignal): Remove const.
|>>> You may need to build and install cygwin from CVS[*] to get the
|>>> corresponding newlib fix and install it into your system headers in
|>>> /usr/include. Or you could patch your /usr/include/string.h
|>>> locally.
|>>>
|>
| A PR should be opened for this. Has that been done? Is it marked as a
| regression and as a blocker?

Sorry for the delay in responding; this mail landed in my inbox while I was on vacation.

An even more fundamental question (but one that still needs a PR, if you haven't created one yet): when building libiberty on cygwin, why is it even trying to compile a strsignal replacement, since cygwin has already been providing strsignal implemented in C++ for years? In other words, there should be no need to compile libiberty's strsignal.c, thus it should not matter whether you are using the 1.5.x (broken) string.h, or the 1.7.0 (fixed) prototype.

|
| Who has responsibility to fix this?

Unfortunately, I've never compiled Fortran myself, and my experience with libiberty is very limited. I don't know why the libiberty configury isn't picking up on the fact that cygwin already has strsignal.

--
Don't work too hard, make some time for fun as well!

Eric Blake [EMAIL PROTECTED]
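The point about not compiling a replacement when the system already provides strsignal can be illustrated with a small C sketch. It assumes a configure-style macro named HAVE_STRSIGNAL (defined by hand here so the sketch is self-contained) and a hypothetical wrapper `signal_description`; it is not libiberty's actual code. Storing the result in a 'const char *' is accepted whether the header declares strsignal as returning 'char *' or 'const char *'.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>

/* Assumption: a configure check would normally define this macro when
   the C library provides strsignal.  It is set by hand for the sketch. */
#ifndef HAVE_STRSIGNAL
#define HAVE_STRSIGNAL 1
#endif

/* Hypothetical wrapper: use the system strsignal when available and
   fall back to a generic description only when it is not. */
const char *signal_description(int signo)
{
#if HAVE_STRSIGNAL
    return strsignal(signo);
#else
    static char buf[32];
    snprintf(buf, sizeof buf, "Signal %d", signo);
    return buf;
#endif
}
```

With a guard of this kind the fallback is never built on a system that has strsignal, so the prototype in that system's string.h cannot conflict with a local definition.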