Re: Hints for backporting gcc 4.5 powerpc fix to gcc 4.4.3?

2011-03-24 Thread Simon Baldwin
On 22 March 2011 14:56, David Edelsohn  wrote:
>
> On Tue, Mar 22, 2011 at 9:25 AM, Simon Baldwin  wrote:
> > I'm currently trying to backport a small part of gcc 4.5 r151729 to
> > gcc 4.4.3.  This revision fixes a problem in powerpc code generation
> > that leads to gcc not using lmw/stmw instructions in function prologue
> > and epilogues, where it could otherwise validly use them.
> >
> > On the face of things, the central piece of r151729 I seem to want is just 
> > this:
> >
> > Index: gcc/config/rs6000/rs6000.c
> > ===
> > --- gcc/config/rs6000/rs6000.c  (revision 151728)
> > +++ gcc/config/rs6000/rs6000.c  (revision 151729)
> > @@ -18033,7 +18033,8 @@ static bool
> >  no_global_regs_above (int first, bool gpr)
> >  {
> >   int i;
> > -  for (i = first; i < gpr ? 32 : 64 ; i++)
> > +  int last = gpr ? 32 : 64;
> > +  for (i = first; i < last; i++)
> >     if (global_regs[i])
> >       return false;
> >   return true;
> >
> > Taking only that and leaving out all of the rest of r151729 lets me
> > build a powerpc gcc that does use lmw/stmw instructions in function
> > prologue and epilogues as hoped.  Unfortunately it also has bad
> > codegen elsewhere.  So it seems I need more than just this little
> > piece of r151729.  Unfortunately, r151729 is a fairly large patch that
> > seems to do a number of jobs and which does not apply readily to gcc
> > 4.4.  At the moment it's not clear to me what other parts of it I
> > might need.
> >
> > Can anyone here offer any hints or pointers on how to extract from the
> > r151729 diff just the few pieces needed to fix this single powerpc
> > codegen bug in gcc 4.4.3?  Anyone recognize this issue and already
> > dealt with it in isolation?
>
> The change to no_global_regs_above() is one of the key pieces, but
> that change exposed other latent bugs, as you have encountered.  One
> needs the additional patches to the save/restore strategy routines and
> prologue/epilogue.  This is why the entire patch was committed in one
> piece.

Thanks for the reply, David.  I'll take another look and see if I can
abstract out just the required pieces.  In practice, though, it looks
like it may be easier for me to just upgrade to gcc 4.5 or 4.6.
Certainly safer.
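
For anyone reading the hunk above and wondering why a one-line change
matters: in C, `<` binds tighter than `?:`, so the old loop condition
parses as (i < gpr) ? 32 : 64, which evaluates to 32 or 64 and is never
false. A minimal sketch of the difference follows. The table
demo_global_regs and both function names are hypothetical stand-ins, not
the code in rs6000.c:

```c
#include <stdbool.h>

/* Hypothetical stand-in for GCC's global_regs[] table.  */
static char demo_global_regs[64];

/* The pre-r151729 loop condition.  It parses as (i < gpr) ? 32 : 64,
   so the result is always 32 or 64 and the loop never ends by itself.  */
static int
old_loop_condition (int i, bool gpr)
{
  return i < gpr ? 32 : 64;
}

/* The r151729 form: compute the bound first, then compare against it.  */
static bool
demo_no_global_regs_above (int first, bool gpr)
{
  int last = gpr ? 32 : 64;
  int i;

  for (i = first; i < last; i++)
    if (demo_global_regs[i])
      return false;
  return true;
}
```

With the fixed bound the function can actually return true, which is
presumably what lets the lmw/stmw paths become reachable and exposes the
latent bugs David mentions.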

--
Google UK Limited | Registered Office: Belgrave House, 76 Buckingham
Palace Road, London SW1W 9TQ | Registered in England Number: 3977902


Modifying instruction flow during scheduling

2011-03-24 Thread Frederic Riss
Hi,

I would like to experiment with modifications to the instruction flow
during scheduling. One motivation for doing that is the combining of
contiguous loads, as was discussed here:
http://gcc.gnu.org/ml/gcc/2010-12/msg00153.html

I've seen that the scheduler itself makes some modifications to the
instruction flow to introduce the speculative form of the instructions,
but is it prepared for modifications to the instructions made from
the target hooks (TARGET_SCHED_REORDER)? If yes, what are the
primitives that one should use to notify it of modifications? If no, what
would it take to make that possible?

Thanks,
Fred


Complex vectorization

2011-03-24 Thread Simon Chopin
Hi,

I'm currently working on trying to implement a way to use the SIMD
instructions of the SSEx family when computing a vector of complex
numbers.

I have to say that I have never worked on compilation techniques before,
and that I have only a little understanding of the vectorization problems.

I've spent a fair amount of time reading documentation and code, and I
came to the conclusion that, at least for the multiplication and
division of complex numbers, I have to implement them as functions in
libgcc, like their scalar counterparts __mul*c3 and __div*c3.

I face a couple of issues here: what are the C types corresponding to
the vector types, assuming they exist?

Also important: I understand that the processor has to be in a
certain state to use the SIMD instructions. Will it stay that way when
calling a function, which partially changes the environment?

Have a nice day,

Simon Chopin





Re: Complex vectorization

2011-03-24 Thread Richard Guenther
On Thu, Mar 24, 2011 at 12:41 PM, Simon Chopin
 wrote:
> Hi,
>
> I'm currently working on trying to implement a way to use the SIMD
> instructions of the SSEx family when computing a vector of complex
> numbers.
>
> I have to say that I have never worked on compilation techniques before,
> and that I only have little understanding of the vectorization problems.
>
> I've spent a fair amount of time reading documentation and code, and I
> came to the conclusion that, at least for the multiplication and
> division of complex numbers, I had to implement them as functions in the
> libgcc as their scalar counterpart, __mul*c3 and __div*c3.
>
> I face a couple of issues here : what are the C types corresponding to
> the vector types, assuming they exist ?

There are no vector-of-complex types, and GCC does not handle this
case internally either.  Instead GCC lowers complex operations to
piecewise scalar operations, so vectorization would operate on vectors
of the complex components.  There are a number of bugs in bugzilla
for complex vectorization, like PR37021 or PR40770.

Richard.
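
To make "piecewise scalar operations" concrete, here is a rough sketch of
a complex multiply written directly on the real and imaginary components.
It is only an illustration of the shape of the lowered code, not what GCC
emits; the function name demo_cmul_split and the split-array layout are
assumptions, and the NaN/infinity recovery done by the libgcc routines is
deliberately left out:

```c
#include <stddef.h>

/* Multiply n complex numbers stored as separate real and imaginary
   arrays.  Every statement is a plain scalar float operation, which is
   the form a loop vectorizer can work with.  */
void
demo_cmul_split (const float *ar, const float *ai,
                 const float *br, const float *bi,
                 float *cr, float *ci, size_t n)
{
  size_t k;

  for (k = 0; k < n; k++)
    {
      cr[k] = ar[k] * br[k] - ai[k] * bi[k];  /* real part */
      ci[k] = ar[k] * bi[k] + ai[k] * br[k];  /* imaginary part */
    }
}
```

For example, (1 + 2i) * (3 + 4i) gives a real part of 3 - 8 and an
imaginary part of 4 + 6.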


gcno file question

2011-03-24 Thread Joel Sherrill

Hi,

RTEMS has been using simulators and some
programs we wrote for coverage analysis
for a while now.

I am looking into writing a converter which
takes coverage data from simulators and produces
.gcno files.  The coverage data is often just
a bitmap of which addresses were executed.  There
is no frequency, just yes/no.  We can already
map that information back to file/line.

+ Is this enough to produce a .gcno file from?

+ What records need to be generated as a minimum?

As a technical sanity question, the RTEMS code
is in a library and we are merging coverage
data from multiple executables to get unified
coverage data.  We abstract away physical
address into offsets into methods and file/line.
Does generating a .gcno from this merged data
sound feasible?

Thoughts, insights, comments appreciated.

Thanks.
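
The merge step described above could be sketched roughly as below. This
says nothing about the .gcno record format itself; it only shows the
yes/no merging of per-method bitmaps. The layout (one bit per offset
into a method) and both function names are assumptions made for
illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Fold one executable's coverage bitmap into the running total.
   A set bit means "this offset was executed in at least one run".  */
void
demo_merge_coverage (uint8_t *merged, const uint8_t *run, size_t nbytes)
{
  size_t k;

  for (k = 0; k < nbytes; k++)
    merged[k] |= run[k];
}

/* Return 1 if the given offset is marked as executed, else 0.  */
int
demo_offset_covered (const uint8_t *bitmap, size_t offset)
{
  return (bitmap[offset / 8] >> (offset % 8)) & 1;
}
```

Since there is no frequency information, bitwise OR is enough: an offset
counts as covered once any executable has reached it.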

--
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherr...@oarcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985




Re: inline assembly vs. intrinsic functions

2011-03-24 Thread roy rosen
2011/3/22 Ian Lance Taylor :
> roy rosen  writes:
>
>> 2010/10/26 Ian Lance Taylor :
>>> roy rosen  writes:
>>>
>>>> I am trying to demonstrate my port capabilities.
>>>> I am writing an application which needs to use instructions like max
>>>> a,b,c,d,e,f where a,b,c are inputs and d,e,f are outputs.
>>>> Is that possible to write an intrinsic function for that?
>>>> I think not because that means that I need to pass d,e,f by reference
>>>> which means that they would be in memory and not in a register as
>>>> meant by the instruction.
>>>
>>> That is correct.  An intrinsic function is a normal function.  If you
>>> want it to have multiple outputs, you need to pass in addresses, or you
>>> need to have it return a struct.
>>>
>>> I'm a bit curious as to why a function named max would have multiple
>>> outputs.
>>>
>>>> Is there any port with such an example?
>>>
>>> Not to my knowledge.  I wrote a private port in which some intrinsics
>>> returned a struct, and to keep everything out of memory I added
>>> additional intrinsics to retrieve elements of the struct.  It's awkward
>>> to use but the resulting code is fine.
>>>
>> Can you please explain how this solution should work?
>> First a code with memory accesses would be generated and then
>> optimizations would optimize it to use registers directly?
>
> You build a RECORD_TYPE holding the fields you want to return.  You
> define the appropriate builtin functions to return that record type.

How is that done? Using define_insn? How do I tell it to return a struct?
Is there an example I can look at?

Roy.

> You define another builtin function for each field, which takes the
> RECORD_TYPE as its argument and returns the type of the field.  In
> TARGET_FOLD_BUILTIN you convert the per-field functions into
> COMPONENT_REFs.
>
> Ian
>


Re: Complex vectorization

2011-03-24 Thread Simon Chopin
On Thu, Mar 24, 2011 at 12:55:44PM +0100, Richard Guenther wrote:
> There are no vector of complex types and GCC internally does not handle
> this case as well.  Instead GCC lowers complex operations to
Yep, sorry, my mistake. I meant array of complex.
> piecewise scalar operations, thus vectorization would have vectors
> of the complex components.  There are a number of bugs in bugzilla
> for complex vectorization, like PR37021 or PR40770.

And yet, when trying to multiply numbers, gcc says that complex isn't a
supported type. From the links you provided, part of the solution would
be to associate the complex type with the vector type of its scalar type.
While that should work for most operations, given support for
IMAGPART_EXPR and REALPART_EXPR, the multiplication and division
operations are implemented as separate functions in libgcc. Because of
that, they wouldn't gain from the vectorization, or am I mistaken
(again)?

Cheers,

Simon

P.S. I am not aware of the list policy regarding the CCs, but I assumed
you were already subscribed.


Re: Complex vectorization

2011-03-24 Thread Richard Guenther
On Thu, Mar 24, 2011 at 2:50 PM, Simon Chopin
 wrote:
> On Thu, Mar 24, 2011 at 12:55:44PM +0100, Richard Guenther wrote:
>> There are no vector of complex types and GCC internally does not handle
>> this case as well.  Instead GCC lowers complex operations to
> Yep, sorry, my mistake. I meant array of complex.
>> piecewise scalar operations, thus vectorization would have vectors
>> of the complex components.  There are a number of bugs in bugzilla
>> for complex vectorization, like PR37021 or PR40770.
>
> And yet, when trying to multiply numbers, gcc says that complex isn't a
> supported type.

_Complex float should work.  The "complex" spelling is a C99 feature and
requires you to include complex.h.

> From the links you provided, part of the solution would
> be to associate the complex type and the vector type of its scalar type.
> While it should work for most operations, providing support for the
> IMAGPART_EXPR and REALPART_EXPR, the multiplication and division
> operations are implemented as separated functions of libgcc. Because of
> that, they wouldn't gain from the vectorization, or I am mistaken
> (again) ?

Multiplication is inlined for -fcx-fortran-rules for example.  Yes, division
is always out-of-line.

Richard.

> Cheers,
>
> Simon
>
> P.S. I am not aware of the list policy regarding the CCs, but I assumed
> you were already subscribed.
>


Re: mov arguments are still the same

2011-03-24 Thread Paulo J. Matos

Let me revive this thread and ask for suggestions/tips on the issue below.

Cheers,

PMatos

On 16/03/11 18:19, Paulo J. Matos wrote:

Hi,

I have touched this subject before:
http://thread.gmane.org/gmane.comp.gcc.devel/116198/focus=116200

Now, at the time I didn't pursue this issue, but with 4.4.4 this
keeps happening, and I traced it to cprop_hardreg replacing a
register, which leaves the set insn with the same source and dest.

Here's insn 32 from pass 183:ce3

(insn 32 31 33 4 h.c:51 (set (reg:QI 1 AL)
        (reg/f:QI 8 @H'fff9 [33])) 4 {*movqi} (expr_list:REG_DEAD (reg/f:QI 8 @H'fff9 [33])
        (expr_list:REG_EQUAL (plus:QI (reg/f:QI 6 Y)
                (const_int 1 [0x1]))
            (nil))))


Now the same insn after the following pass 185:cprop_hardreg


insn 32: replaced reg 8 with 1

...

(insn 32 31 33 4 h.c:51 (set (reg:QI 1 AL)
        (reg/f:QI 1 AL [33])) 4 {*movqi} (expr_list:REG_DEAD (reg/f:QI 8 @H'fff9 [33])
        (expr_list:REG_EQUAL (plus:QI (reg/f:QI 6 Y)
                (const_int 1 [0x1]))
            (nil))))


This stays as is until assembler generation, which is really annoying
because it generates an instruction that is basically a nop, like:
...
mov AL,AL
...

Is this a known issue? I can't see how this is a problem with my backend,
but it might well be. Maybe cprop_regmove should check whether it is making
the src equal to the dest and, if so, remove the insn?

Any suggestions?

--
PMatos
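
The check proposed above amounts to something like the following toy
model. It does not use GCC's RTL data structures at all; struct demo_move
and demo_drop_noop_moves are hypothetical names, and a real pass would
work on insns rather than on an array of register pairs:

```c
#include <stddef.h>

/* Toy model of a register-to-register move; not GCC's rtx.  */
struct demo_move
{
  int dest;  /* destination hard register number */
  int src;   /* source hard register number */
};

/* Compact the array in place, dropping every move whose source and
   destination are the same register.  Returns the new element count.  */
size_t
demo_drop_noop_moves (struct demo_move *moves, size_t n)
{
  size_t in, out = 0;

  for (in = 0; in < n; in++)
    if (moves[in].dest != moves[in].src)
      moves[out++] = moves[in];
  return out;
}
```

In the dump above, insn 32 becomes a move from register 1 to register 1
after the replacement, so under this model it would be dropped.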







Re: inline assembly vs. intrinsic functions

2011-03-24 Thread Ian Lance Taylor
roy rosen  writes:

>> You build a RECORD_TYPE holding the fields you want to return.  You
>> define the appropriate builtin functions to return that record type.
>
> How is that done? using define_insn? How do I tell it to return a struct?
> Is there an example I can look at?

A RECORD_TYPE is what gcc generates when you define a struct in your
source code.  For an example of a backend building a struct, see, e.g.,
ix86_build_builtin_va_list_abi.

When you define your builtin functions in TARGET_INIT_BUILTINS you
specify the argument types and the return type, typically by building a
FUNCTION_TYPE and passing it to add_builtin_function.  To define a
builtin which returns a struct, just arrange for the return type of the
FUNCTION_TYPE that you pass to add_builtin_function to be the RECORD_TYPE
that you built.

Ian
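
Seen from the user's side, the scheme described in this thread (one
builtin returning a record, plus one accessor builtin per field) would
look roughly like the plain C below. Everything here is hypothetical:
the names, the struct layout, and especially the semantics of the
three-output operation, which the thread never spells out. In a real
port the accessors would be folded to COMPONENT_REFs in
TARGET_FOLD_BUILTIN rather than being ordinary functions:

```c
/* Hypothetical result record holding the three outputs.  */
struct demo_max3
{
  int d;
  int e;
  int f;
};

static int
demo_max (int x, int y)
{
  return x > y ? x : y;
}

/* Stand-in for a builtin that returns the whole record.  The pairwise
   maxima computed here are made up purely for illustration.  */
struct demo_max3
demo_builtin_max3 (int a, int b, int c)
{
  struct demo_max3 r;

  r.d = demo_max (a, b);
  r.e = demo_max (b, c);
  r.f = demo_max (a, c);
  return r;
}

/* Stand-ins for the per-field accessor builtins.  */
int demo_builtin_max3_d (struct demo_max3 r) { return r.d; }
int demo_builtin_max3_e (struct demo_max3 r) { return r.e; }
int demo_builtin_max3_f (struct demo_max3 r) { return r.f; }
```

The caller never takes the address of an output, which is what allows
the values to stay in registers.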


mt-ospace usage in m32r and fr30

2011-03-24 Thread Paolo Bonzini
These targets are using -Os to build target libraries.  Perhaps the 
right thing to do instead would be to disable some optimizations 
selectively in the compiler?


Thanks!

Paolo


Re: Complex vectorization

2011-03-24 Thread Richard Henderson
On 03/24/2011 07:47 AM, Richard Guenther wrote:
> Multiplication is inlined for -fcx-fortran-rules for example.  Yes, division
> is always out-of-line.

Division is inlined with -fcx-limited-range.


r~
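
For reference, the textbook division formula without any range reduction
step looks like the sketch below. The GCC manual describes
-fcx-limited-range as stating that range reduction is not needed for
complex division; the code here is only my illustration of that kind of
straightforward formula, not code taken from GCC, and the function name
is made up:

```c
/* Compute (a + b*i) / (c + d*i) with the straightforward formula.
   No scaling is done, so c*c + d*d can overflow or underflow for
   extreme inputs; that is the trade-off of skipping range reduction.  */
void
demo_cdiv_limited (float a, float b, float c, float d,
                   float *re, float *im)
{
  float denom = c * c + d * d;

  *re = (a * c + b * d) / denom;
  *im = (b * c - a * d) / denom;
}
```

Because this is just a handful of scalar multiplies, adds and divides,
it is small enough to expand inline.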


Re: Second GCC 4.6.0 release candidate is now available

2011-03-24 Thread Michael Hope
On Tue, Mar 22, 2011 at 11:12 AM, Jakub Jelinek  wrote:
> A second GCC 4.6.0 release candidate is available at:
>
> ftp://gcc.gnu.org/pub/gcc/snapshots/4.6.0-RC-20110321/
>
> Please test the tarballs and report any problems to Bugzilla.
> CC me on the bugs if you believe they are regressions from
> previous releases severe enough to block the 4.6.0 release.
>
> If no more blockers appear I'd like to release GCC 4.6.0
> early next week.

The RC bootstraps C, C++, Fortran, Obj-C, and Obj-C++ on
ARMv7/Cortex-A9/Thumb-2/NEON, ARMv5T/ARM/softfp, ARMv5T/Thumb/softfp,
and ARMv4T/ARM/softfp.  I'm afraid I haven't reviewed the test results
(Richard? Ramana?)

See:
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02298.html
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02391.html
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02394.html
 http://gcc.gnu.org/ml/gcc-testresults/2011-03/msg02393.html

and:
 http://builds.linaro.org/toolchain/gcc-4.6.0-RC-20110321/logs/

-- Michael


gcc-4.5-20110324 is now available

2011-03-24 Thread gccadmin
Snapshot gcc-4.5-20110324 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.5-20110324/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.5 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_5-branch 
revision 171425

You'll find:

 gcc-4.5-20110324.tar.bz2 Complete GCC (includes all of below)

  MD5=9e8dfb8a5e75b885337699c79b0ed1ba
  SHA1=2a2b6fdbd610d6c8915bdc24f42839a6df8efb13

 gcc-core-4.5-20110324.tar.bz2 C front end and core compiler

  MD5=ed1a6ad884f7650ec7de81c6956744dc
  SHA1=e72e476df8bb83bd0406ece833ecb327631aa0c8

 gcc-ada-4.5-20110324.tar.bz2 Ada front end and runtime

  MD5=5fe17340ca5d91afc07b572d23b83615
  SHA1=1b26b2ac9c489bdfa1be96cd4469201633e1ae96

 gcc-fortran-4.5-20110324.tar.bz2 Fortran front end and runtime

  MD5=9bac9af4671ccd9720f6abd11e4e983e
  SHA1=912e0628587622284fe69274c6dc0fec7a97224e

 gcc-g++-4.5-20110324.tar.bz2 C++ front end and runtime

  MD5=221d157533f5fb7fdc82e0acc30a2013
  SHA1=4720c3e002441f5c59998ee39db2e3ac613bdd8b

 gcc-go-4.5-20110324.tar.bz2  Go front end and runtime

  MD5=d6d5d5c37ac87a240109966f68066a74
  SHA1=71fac6499f5cffa48c63420dee6497f1346a36a5

 gcc-java-4.5-20110324.tar.bz2 Java front end and runtime

  MD5=83e088b55efaac658ecf8a2a1bbe2ed1
  SHA1=02ed9eed4f68d97ccad18194db11eecb94c14d6e

 gcc-objc-4.5-20110324.tar.bz2 Objective-C front end and runtime

  MD5=232225b6f202f6356e7046cca5ec19c8
  SHA1=413266384c9e677e86a0fb0b7f55267cdfcffd5b

 gcc-testsuite-4.5-20110324.tar.bz2   The GCC testsuite

  MD5=0c2c790e7a59dbb25117dca6927e7666
  SHA1=e61bd8ece43001d57f60eb6fb6e7a95e09768fee

Diffs from 4.5-20110317 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.5
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.