Re: Inefficient loop unrolling.

2008-07-10 Thread Paolo Bonzini

Bingfeng Mei wrote:

Steven,
I just created a bug report. You should receive a CCed mail now.

I can see these issues are solvable at RTL-level, but require lots of
efforts. The main optimization in loop unrolling pass, split iv, can
reduce dependence chain but not extra ADDs and alias issue. What is the
main reason that loop unrolling should belong to RTL level? Is it
fundamental?


No, it is just effectiveness of the code size expansion heuristics. 
Ivopts is already complex enough on the tree level, that doing it on RTL 
would be insane.  But other low-level loop optimizations had already 
been written on the RTL level and since there were no compelling 
reasons, they were left there.


That said, this is a bug -- fwprop should have folded the ADDs, at the 
very least.  I'll look at the PR.


Paolo


RE: Inefficient loop unrolling.

2008-07-10 Thread Bingfeng Mei
Paolo,
Thanks for the reply. However, I am not sure it is a simple folding
issue. 

For example,

    B1 = B + 4;
    = [A, B1]
    B2 = B + 8;
    = [A, B2]
    B3 = B + 12;
    = [A, B3]

should be transformed to

    C = A + B;
    = [C, 4]
    = [C, 8]
    = [C, 12]

The loop exit condition needs to be changed accordingly.
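
The rewrite above can be sketched in C. This is only an illustration, assuming that "= [A, B1]" denotes a 32-bit load from address A + B1 and that the loop advances B by 12 per unrolled iteration; the function names and the helper load32 are hypothetical and do not come from the PR or from GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Models "= [base, off]": a 32-bit load from base + off bytes. */
static int32_t load32(const unsigned char *base, size_t off)
{
    int32_t v;
    memcpy(&v, base + off, sizeof v);
    return v;
}

/* Shape left by the unroller: a fresh ADD feeds every load. */
int32_t sum_unrolled_naive(const unsigned char *a, size_t b, size_t iters)
{
    int32_t sum = 0;
    for (size_t i = 0; i < iters; i++) {
        size_t b1 = b + 4;
        sum += load32(a, b1);
        size_t b2 = b + 8;
        sum += load32(a, b2);
        size_t b3 = b + 12;
        sum += load32(a, b3);
        b = b3;
    }
    return sum;
}

/* Desired shape: one base C = A + B, constant offsets per load,
   and an exit test expressed on C instead of on the counter. */
int32_t sum_unrolled_folded(const unsigned char *a, size_t b, size_t iters)
{
    int32_t sum = 0;
    const unsigned char *c = a + b;
    const unsigned char *end = c + 12 * iters;
    assert(a != NULL);
    while (c != end) {
        sum += load32(c, 4);
        sum += load32(c, 8);
        sum += load32(c, 12);
        c += 12;
    }
    return sum;
}
```

Both functions read the same addresses; the second one needs a single pointer increment per iteration, which is the form that maps onto base-plus-constant addressing modes.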

BTW, I just added an experimental tree-level loop unrolling pass in my
port, right before the ivopts pass. The results are very promising except
for a few quirks, which I believe to be problems in ivopts. The
generated assembly code is now as good as manual unrolling.

Cheers,
Bingfeng


-Original Message-
From: Paolo Bonzini [mailto:[EMAIL PROTECTED] On Behalf Of Paolo
Bonzini
Sent: 10 July 2008 13:34
To: Bingfeng Mei
Cc: Steven Bosscher; gcc@gcc.gnu.org
Subject: Re: Inefficient loop unrolling.

Bingfeng Mei wrote:
> Steven,
> I just created a bug report. You should receive a CCed mail now.
> 
> I can see these issues are solvable at RTL-level, but require lots of
> efforts. The main optimization in loop unrolling pass, split iv, can
> reduce dependence chain but not extra ADDs and alias issue. What is the
> main reason that loop unrolling should belong to RTL level? Is it
> fundamental?

No, it is just effectiveness of the code size expansion heuristics. 
Ivopts is already complex enough on the tree level, that doing it on RTL
would be insane.  But other low-level loop optimizations had already 
been written on the RTL level and since there were no compelling 
reasons, they were left there.

That said, this is a bug -- fwprop should have folded the ADDs, at the 
very least.  I'll look at the PR.

Paolo




[lto] Bootstrap failure

2008-07-10 Thread Diego Novillo
Is this the bootstrap failure that you folks were discussing in
another thread? Is anyone fixing this?


/home/dnovillo/perf/sbox/lto/local.x86_64/src/gcc/lto-function-in.c:1984:
error: request for implicit conversion from 'void *' to 'union
tree_node **' not permitted in C++
make[3]: *** [lto-function-in.o] Error 1
make[3]: *** Waiting for unfinished jobs
rm gcj-dbtool.pod fsf-funding.pod jcf-dump.pod jv-convert.pod
grmic.pod gcov.pod gcj.pod gc-analyze.pod gfdl.pod cpp.pod gij.pod
gcc.pod gfortran.pod
make[3]: Leaving directory
`/usr/local/google/dnovillo/perf/sbox/lto/local.x86_64/bld/gcc'
make[2]: *** [all-stage2-gcc] Error 2
make[2]: Leaving directory
`/usr/local/google/dnovillo/perf/sbox/lto/local.x86_64/bld'
make[1]: *** [stage2-bubble] Error 2
make[1]: Leaving directory
`/usr/local/google/dnovillo/perf/sbox/lto/local.x86_64/bld'
make: *** [bootstrap] Error 2

Thanks.  Diego.
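
For readers unfamiliar with this diagnostic: as far as I can tell it is the message GCC emits under -Wc++-compat when C code relies on the implicit conversion from 'void *', which C allows and C++ does not. The sketch below shows the usual fix, an explicit cast. The union, the typedef and the function are hypothetical stand-ins; the actual code at lto-function-in.c:1984 is not shown in this thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for GCC's 'union tree_node'. */
union node {
    int code;
    double pad;
};
typedef union node *node_ptr;

/* Allocate an array of n node pointers. */
node_ptr *alloc_slots(size_t n)
{
    node_ptr *slots;
    assert(n > 0);
    /* "slots = calloc(n, sizeof *slots);" is valid C but is rejected
       when compiled as C++; the explicit cast is accepted by both. */
    slots = (node_ptr *) calloc(n, sizeof *slots);
    return slots;
}
```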


Re: [lto] Bootstrap failure

2008-07-10 Thread Rafael Espindola
2008/7/10 Diego Novillo <[EMAIL PROTECTED]>:
> Is this the bootstrap failure that you folks were discussing in
> another thread? Is anyone fixing this?

I just committed Bill's patch with some small modifications. Should be
bootstrapping now.

>
> /home/dnovillo/perf/sbox/lto/local.x86_64/src/gcc/lto-function-in.c:1984:
> error: request for implicit conversion from 'void *' to 'union
> tree_node **' not permitted in C++
> make[3]: *** [lto-function-in.o] Error 1
> make[3]: *** Waiting for unfinished jobs
> rm gcj-dbtool.pod fsf-funding.pod jcf-dump.pod jv-convert.pod
> grmic.pod gcov.pod gcj.pod gc-analyze.pod gfdl.pod cpp.pod gij.pod
> gcc.pod gfortran.pod
> make[3]: Leaving directory
> `/usr/local/google/dnovillo/perf/sbox/lto/local.x86_64/bld/gcc'
> make[2]: *** [all-stage2-gcc] Error 2
> make[2]: Leaving directory
> `/usr/local/google/dnovillo/perf/sbox/lto/local.x86_64/bld'
> make[1]: *** [stage2-bubble] Error 2
> make[1]: Leaving directory
> `/usr/local/google/dnovillo/perf/sbox/lto/local.x86_64/bld'
> make: *** [bootstrap] Error 2
>
> Thanks.  Diego.
>


Cheers,
-- 
Rafael Avila de Espindola

Google Ireland Ltd.
Gordon House
Barrow Street
Dublin 4
Ireland

Registered in Dublin, Ireland
Registration Number: 368047


The Linux binutils 2.18.50.0.8 is released

2008-07-10 Thread H.J. Lu
This is the beta release of binutils 2.18.50.0.8 for Linux, which is
based on binutils 2008 0709 in CVS on sourceware.org plus various
changes. It is purely for Linux.

All relevant patches in the patches directory have been applied to the
source tree. You can take a look at patches/README to see what has been
applied and in what order.

Starting from the 2.18.50.0.4 release, the x86 assembler no longer
accepts

fnstsw %eax

fnstsw stores 16 bits into %ax and leaves the upper 16 bits of %eax
unchanged. Please use

fnstsw %ax

Starting from the 2.17.50.0.4 release, the default output section LMA
(load memory address) has changed for allocatable sections from being
equal to VMA (virtual memory address), to keeping the difference between
LMA and VMA the same as the previous output section in the same region.

For

.data.init_task : { *(.data.init_task) }

LMA of .data.init_task section is equal to its VMA with the old linker.
With the new linker, it depends on the previous output section. You
can use

.data.init_task : AT (ADDR(.data.init_task)) { *(.data.init_task) }

to ensure that LMA of .data.init_task section is always equal to its
VMA. The linker script in the older 2.6 x86-64 kernel depends on the
old behavior.  You can add AT (ADDR(section)) to force LMA of
.data.init_task section equal to its VMA. It will work with both old
and new linkers. The x86-64 kernel linker script in kernel 2.6.13 and
above is OK.
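
The difference between the two default rules can be written out as arithmetic. The following is a simplified model, not linker code: the struct, the function names and the addresses in the usage note are assumptions made for illustration, and the model ignores memory regions and explicit AT clauses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one output section; not an ld structure. */
struct out_section {
    uint64_t vma;   /* virtual memory address */
    uint64_t lma;   /* load memory address */
};

/* Old default: the LMA of a section equals its VMA. */
uint64_t default_lma_old(uint64_t vma)
{
    return vma;
}

/* New default: keep the (LMA - VMA) difference of the previous output
   section. Unsigned arithmetic wraps, so a "negative" difference works. */
uint64_t default_lma_new(const struct out_section *prev, uint64_t vma)
{
    assert(prev != 0);
    return vma + (prev->lma - prev->vma);
}
```

With a previous section at VMA 0xffffffff80200000 and LMA 0x200000, a section at VMA 0xffffffff80300000 gets LMA 0x300000 under the new rule, while the old rule gives 0xffffffff80300000. Writing AT (ADDR(section)) corresponds to asking for the old result explicitly.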

The new x86_64 assembler no longer accepts

monitor %eax,%ecx,%edx

You should use

monitor %rax,%ecx,%edx

or

monitor

which works with both old and new x86_64 assemblers. They should
generate the same opcode.

The new i386/x86_64 assemblers no longer accept instructions for moving
between a segment register and a 32bit memory location, i.e.,

movl (%eax),%ds
movl %ds,(%eax)

To generate instructions for moving between a segment register and a
16bit memory location without the 16bit operand size prefix, 0x66,

mov (%eax),%ds
mov %ds,(%eax)

should be used. It will work with both new and old assemblers. The
assembler starting from 2.16.90.0.1 will also support

movw (%eax),%ds
movw %ds,(%eax)

without the 0x66 prefix. Patches for 2.4 and 2.6 Linux kernels are
available at

http://www.kernel.org/pub/linux/devel/binutils/linux-2.4-seg-4.patch
http://www.kernel.org/pub/linux/devel/binutils/linux-2.6-seg-5.patch

The ia64 assembler now tunes for Itanium 2 processors by default.
To build a kernel for Itanium 1 processors, you will need to add

ifeq ($(CONFIG_ITANIUM),y)
CFLAGS += -Wa,-mtune=itanium1
AFLAGS += -Wa,-mtune=itanium1
endif

to arch/ia64/Makefile in your kernel source tree.

Please report any bugs related to binutils 2.18.50.0.8 to
[EMAIL PROTECTED]

and

http://www.sourceware.org/bugzilla/

Changes from binutils 2.18.50.0.7:

1. Update from binutils 2008 0709.
2. Allow vmovd with 64bit operand in x86 assembler.
3. Improve -msse-check in x86 assembler.
4. Fix an AVX assembler bug in Intel syntax.  PR 6517.
5. Improve error message in Intel syntax for x86 assembler.  PR 6518.
6. Add the ".sse_check" directive to x86 assembler.
7. Improve gold.
8. Improve objcopy/strip.  PR 2995/6473.
9. Improve objdump -g.  PR 6483.
10. Improve ld --sort-common.  PR 6430.
11. Add multi-GOT support for m68k.
12. Fix various arm bugs.
13. Fix various avr bugs.
14. Fix various hppa bugs.
15. Fix various m68k bugs.
16. Fix various mips bugs.
17. Fix various mmix bugs.
18. Fix various ppc bugs.
19. Fix various spu bugs.
20. Fix various xtensa bugs.

Changes from binutils 2.18.50.0.6:

1. Update from binutils 2008 0502.
2. Add Intel EPT and MOVBE support.
3. Correct Intel FMA operand order.
4. Change Intel CLMUL to Intel PCLMUL.
5. Add -msse-check to x86 assembler to warn about SSE instructions where
there is an AVX equivalent.
6. Provide backward compatibility for ELF object files with more
than 64K sections generated by the older binutils.  PR 6412.
7. Improve FDPIC support.
8. Add -wL switch to readelf to dump decoded contents of .debug_line.
9. Add -ag switch to assembler to show general information in listings.
10. Improve objcopy symbol filtering performance.  PR 6034.
12. Correct thin archive support.
13. Improve ELF/Sparc support.
14. Fix various mips bugs.
15. Fix various sh bugs.
16. Fix various spu bugs.

Changes from binutils 2.18.50.0.5:

1. Update from binutils 2008 0403.
2. Add Intel AES, CLMUL, AVX/FMA support.
3. Improve error handling in x86 linker for undefined hidden/internal
symbols when building a shared object.  PR ld/5789/5943.
4. Add a new ELF linker, gold.
5. Add thin archive support.
6. Fix various arm bugs.
7. Fix various avr bugs.
8. Fix various bfin bugs.
9. Fix various hppa bugs.
10. Fix various m68k bugs.
11. Fix various mips bugs.
12. Fix various s390 bugs.
13. Fix various spu bugs.

Changes from binutils 2.18.50.0.4:

1. Update from binutils 2008 0314.
2. Add Intel XSAVE new instructio

gcc-4.3-20080710 is now available

2008-07-10 Thread gccadmin
Snapshot gcc-4.3-20080710 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.3-20080710/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.3 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_3-branch 
revision 137704

You'll find:

gcc-4.3-20080710.tar.bz2            Complete GCC (includes all of below)

gcc-core-4.3-20080710.tar.bz2       C front end and core compiler

gcc-ada-4.3-20080710.tar.bz2        Ada front end and runtime

gcc-fortran-4.3-20080710.tar.bz2    Fortran front end and runtime

gcc-g++-4.3-20080710.tar.bz2        C++ front end and runtime

gcc-java-4.3-20080710.tar.bz2       Java front end and runtime

gcc-objc-4.3-20080710.tar.bz2       Objective-C front end and runtime

gcc-testsuite-4.3-20080710.tar.bz2  The GCC testsuite

Diffs from 4.3-20080703 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.3
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Failure building GFortran (Cygwin)

2008-07-10 Thread Eric Blake

According to Jerry DeLisle on 6/29/2008 11:45 AM:
| Ian Lance Taylor wrote:
|> CC'ed to Eric.  This may require some configury patches somewhere.
|>
| Adjust strsignal to POSIX 200x prototype.
| * strsignal.c (strsignal): Remove const.
|>>>   You may need to build and install cygwin from CVS[*] to get the
|>>> corresponding newlib fix and install it into your system headers in
|>>> /usr/include.  Or you could patch your /usr/include/string.h
|>>> locally.
|>>>
|>
| A PR should be opened for this. Has that been done?  Is it marked as a
| regression and as a blocker?

Sorry for the delay in responding; this mail landed in my inbox while I
was on vacation.  An even more fundamental question (but one that still
needs a PR, if you haven't created one yet): when building libiberty on
cygwin, why is it even trying to compile a strsignal replacement, since
cygwin has already been providing strsignal implemented in C++ for years?
In other words, there should be no need to compile libiberty's
strsignal.c, so it should not matter whether you are using the 1.5.x
(broken) string.h or the 1.7.0 (fixed) prototype.

|
| Who has responsibility to fix this?

Unfortunately, I've never compiled Fortran myself, and my experience with
libiberty is very limited.  I don't know why the libiberty configury isn't
picking up on the fact that cygwin already has strsignal.
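
To make the point concrete, here is a rough sketch of how a replacement is normally kept out of the build when the C library already has the function. HAVE_STRSIGNAL is assumed to be the macro a configure check would define; it is set by hand here, and the helper describe_signal is hypothetical. This is not libiberty's actual source.

```c
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Stand-in for the result of a configure check; the real value would
   come from config.h. Assumed here: the C library provides strsignal. */
#define HAVE_STRSIGNAL 1

/* POSIX prototype: the return type is plain 'char *'. A header that
   still declares 'const char *strsignal(int)' would conflict here. */
char *strsignal(int signo);

#ifndef HAVE_STRSIGNAL
/* Replacement, compiled only when the system lacks strsignal. */
char *strsignal(int signo)
{
    static char buf[] = "unknown signal";
    (void) signo;
    return buf;
}
#endif

/* Hypothetical caller: never returns NULL. */
const char *describe_signal(int signo)
{
    const char *msg;
    assert(signo > 0);
    msg = strsignal(signo);
    return msg != NULL ? msg : "unknown signal";
}
```

If the check correctly detects the system strsignal, the replacement body is never compiled, and a mismatch in its prototype cannot break the build.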

--
Don't work too hard, make some time for fun as well!

Eric Blake [EMAIL PROTECTED]