Where can I put the GOT optimization for the ARM back end?

2010-03-28 Thread Carrot Wei
Hi

The detailed description of the optimization is at
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43129. This is an ARM
specific optimization.

This optimization uses one less register (the register that holds the
GOT base). To benefit from this, the ideal place for it is before
register allocation.

Usually the expand pass generates instructions to load a global
variable's address from its GOT entry for each access of the global
variable. Later cse/gcse passes can remove many of them. In order to
model the cost precisely, this optimization should be placed after some
cse/gcse passes.
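
As an illustration of the access pattern described above, here is a minimal
C sketch. The names counter, limit and bump are hypothetical and do not come
from the bug report. When code like this is built as position-independent
code, every reference to a global is the kind of access whose address would
initially be loaded from a GOT entry, and the later cse/gcse passes are what
would be expected to merge the repeated loads.

```c
/* Hypothetical globals with external linkage. In position-independent
   code their addresses are normally obtained through GOT entries. */
int counter;
int limit = 10;

/* Touches `counter` up to three times and `limit` once. Before any
   cse/gcse cleanup, each of these accesses may carry its own load of
   the variable's address. */
int bump(void)
{
    if (counter < limit)
        counter++;
    return counter;
}
```

Counting the distinct globals that remain after redundancy elimination,
rather than the raw number of accesses, is presumably what a cost model for
this optimization would need.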

So what is the best place for this optimization? Is there an existing
pass that can be enhanced with this optimization? Or should I add a new
pass?

thanks
Guozhi


How difficult is it to make UNITS_PER_WORD 1?

2010-03-28 Thread Dave Milter
Hello,

I am trying to estimate the time and effort needed to implement support
for one CPU in gcc 4.x.

The main problem, as far as I can see, is that the CPU can only access
memory using a 32-bit word as the unit.

I tried to play with ggx:
http://spindazzle.org/ggx/

and force it to work with memory like the above-mentioned CPU:

#define UNITS_PER_WORD 1
#define BITS_PER_UNIT 32
#define BITS_PER_WORD 32
#define POINTER_SIZE 32

but this causes a segfault in builtin_define_float_constants.

The old implementation for this CPU, based on the 3.3.x branch, uses a trick like this:
#define QImode SImode

but now it does not work.

I defined Pmode as QImode but this does not help either; it still segfaults.

Can anybody with deep knowledge of gcc briefly describe
the main steps to make the compiler generate code for a UNITS_PER_WORD==1 target?
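
To make the consequence of these settings concrete, here is a small C sketch
of the size arithmetic on such a target. It is only an illustration under the
assumption that every object occupies a whole number of 32-bit addressable
units; the names SKETCH_BITS_PER_UNIT and size_in_units are hypothetical and
are not GCC macros or functions.

```c
/* Hypothetical stand-in for the target's BITS_PER_UNIT setting. */
enum { SKETCH_BITS_PER_UNIT = 32 };

/* Number of addressable units needed to hold `bits` bits, rounded up.
   With 32-bit units, anything up to 32 bits takes exactly one unit. */
static unsigned size_in_units(unsigned bits)
{
    return (bits + SKETCH_BITS_PER_UNIT - 1) / SKETCH_BITS_PER_UNIT;
}
```

Under this assumption an 8-bit quantity and a 32-bit quantity both take one
unit, which would be consistent with using the one-unit mode (QImode) for
pointers, as attempted above.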


GSoC 2010 Project Idea

2010-03-28 Thread Артем Шинкаров
Hi,

I have a project in mind which I'm going to propose to GCC as part of
Google Summer of Code. My project is not on the list of project ideas
(http://gcc.gnu.org/wiki/SummerOfCode), which is why I would be very interested
to hear any opinions and maybe even to find a mentor.


1. Project idea

In brief, the project idea is to create an abstract layer for vectorized
computations. This would make it possible to write portable vectorized code.


2. State of the art

Nowadays most processors support SIMD computations. However, the
problem is that each architecture has a different set of SIMD instructions: Intel
MMX+SSE+AVX, PowerPC Altivec, ARM iWMMXt, and so on. GCC supports most of the
architecture-specific instructions by providing built-in functions. It is
quite convenient to use these functions when you want to optimize some
piece of code. The problem starts when you want to make this code portable.
That is not a very common task, and of course GCC has a vectorizer.
Unfortunately, there are many examples which show that it is relatively simple
for a human to find the right place in the code and vectorize it, but
extremely hard for the compiler to do the same. As a result we end up with
code which does not use the capabilities of the architecture.
It would be much easier for the programmer to use an abstract layer to
implement vectorized code. The compiler should deal with the portability issues,
dispatching the code from the abstract layer to the particular architecture. In
my experience there is no such library for C/C++ that solves
the problem. There are some attempts like http://libsimd.sourceforge.net/ but
it covers only a small part of the idea, and unfortunately its development is
suspended. Or maybe I am wrong and everything is already written?


3. Implementation

First we need to define the functionality of an abstract SIMD model which can
be mapped to the set of architectures we want to support. The difficulty is that
the SIMD instruction sets of different architectures are not fully compatible.
Then we want to write a set of "fake-SIMD" functions to make sure that our code
is usable on architectures without SIMD support.
After that there is the question of how to dispatch functions from the abstract
layer to the architecture layer. The trivial thing to do is just to map the
abstract layer functions to the built-in functions. Obviously that would not give
the best performance. For example, loading data from unaligned memory
into a SIMD register is much slower than loading data from aligned
memory. Altivec has an instruction vec_madd(a,b,c) which has to be expressed as
two instructions in the SSE case: _mm_add_ps(_mm_mul_ps(a,b), c). This means that
some code optimizations are required.
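
A minimal sketch of the "trivial" dispatch described above, to make the idea
concrete. The vf4 type and the vf4_load, vf4_store, vf4_madd and madd_array
names are hypothetical and are not part of any existing library. Only two
back ends are shown: SSE, using the intrinsics mentioned above, and a
"fake-SIMD" scalar fallback. An Altivec back end would presumably map
vf4_madd directly to vec_madd.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__)
#include <xmmintrin.h>

/* SSE back end: four packed floats in one 128-bit register. */
typedef __m128 vf4;

static inline vf4 vf4_load(const float *p) { return _mm_loadu_ps(p); }
static inline void vf4_store(float *p, vf4 v) { _mm_storeu_ps(p, v); }

/* SSE has no fused multiply-add, so two instructions are needed. */
static inline vf4 vf4_madd(vf4 a, vf4 b, vf4 c)
{
    return _mm_add_ps(_mm_mul_ps(a, b), c);
}

#else

/* "Fake-SIMD" back end: plain scalar code with the same interface. */
typedef struct { float lane[4]; } vf4;

static inline vf4 vf4_load(const float *p)
{
    vf4 v;
    for (int i = 0; i < 4; i++)
        v.lane[i] = p[i];
    return v;
}

static inline void vf4_store(float *p, vf4 v)
{
    for (int i = 0; i < 4; i++)
        p[i] = v.lane[i];
}

static inline vf4 vf4_madd(vf4 a, vf4 b, vf4 c)
{
    vf4 r;
    for (int i = 0; i < 4; i++)
        r.lane[i] = a.lane[i] * b.lane[i] + c.lane[i];
    return r;
}

#endif

/* Portable user code: out[i] = a[i] * b[i] + c[i].
   n must be a multiple of 4 in this sketch. */
static void madd_array(float *out, const float *a, const float *b,
                       const float *c, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4)
        vf4_store(out + i,
                  vf4_madd(vf4_load(a + i), vf4_load(b + i), vf4_load(c + i)));
}
```

The sketch always uses unaligned loads and stores in the SSE case, which is
exactly the kind of conservative choice that a smarter dispatcher could
improve on when it knows the data is aligned.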


4. Time constraints

GSoC gives 4 months to finish the project. This means that the
timeline could be the following:
2 weeks -- discussions and design
1 week  -- fake SIMD
3 weeks -- implementation of the main dispatcher
2 weeks -- benchmarks and testing
* the first submission
1.5 months -- architecture-specific dispatcher optimizations
0.5 months -- testing
* the second submission

This project can be continued in various ways:
1) Cost model for the dispatcher
2) Auto vectorizer + dispatcher
3) Integration with other languages
And so on


5. Questions

Should it be a library or part of the language? What about extending
this abstract layer for Larrabee (or similar), which
provides 512-bit registers for vectorized operations? And so on.
These questions should be discussed taking into account the project time
constraints and the interests of GCC. If anybody is interested in mentoring such a
project please let me know and I would be happy to discuss all the issues. If
anybody thinks that the project is hopeless, please let me know as well.

--
Best regards,
Artem Shinkarov
Compiler Technology and Computer Architecture Group
University of Hertfordshire


gcc-4.3-20100328 is now available

2010-03-28 Thread gccadmin
Snapshot gcc-4.3-20100328 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.3-20100328/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.3 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_3-branch 
revision 157786

You'll find:

gcc-4.3-20100328.tar.bz2  Complete GCC (includes all of below)

gcc-core-4.3-20100328.tar.bz2 C front end and core compiler

gcc-ada-4.3-20100328.tar.bz2  Ada front end and runtime

gcc-fortran-4.3-20100328.tar.bz2  Fortran front end and runtime

gcc-g++-4.3-20100328.tar.bz2  C++ front end and runtime

gcc-java-4.3-20100328.tar.bz2 Java front end and runtime

gcc-objc-4.3-20100328.tar.bz2 Objective-C front end and runtime

gcc-testsuite-4.3-20100328.tar.bz2 The GCC testsuite

Diffs from 4.3-20100321 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.3
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Stepping down as SPU backend maintainer

2010-03-28 Thread Andrew Pinski
Hi,
  I am stepping down as the spu backend maintainer since Sony removed
GNU/Linux (OtherOS) support from their newer PS3 firmware.  The main
reason is I will no longer have access to a machine to support the
target.  But really this is also a step backwards for free software
support from Sony.

Thanks,
Andrew Pinski

ChangeLog:

* MAINTAINERS (spu port): Remove me.
Index: MAINTAINERS
===
--- MAINTAINERS (revision 157742)
+++ MAINTAINERS (working copy)
@@ -95,7 +95,6 @@
 sparc port Jakub Jelinek   ja...@redhat.com
 sparc port Eric Botcazou   ebotca...@libertysurf.fr
 spu port   Trevor Smigiel  trevor_smig...@playstation.sony.com
-spu port   Andrew Pinski   pins...@gmail.com
 spu port   David Edelsohn  edels...@gnu.org
 v850 port  Nick Clifton    ni...@redhat.com
 vax port   Matt Thomas m...@3am-software.com