date:20050321

Re: AVR: CC0 to CCmode conversion

2005-03-21 Thread Paul Schlie

> From: Denis Chertykov <[EMAIL PROTECTED]>
>> - possibly something like: ?
>> 
>>   (define_insn "*addhi3"
>> [(set (match_operand:HI 0 ...)
>>(plus:HI (match_operand:HI 1 ...)
>> (match_operand:HI 2 ...)))
>>  (set (reg ZCMP_FLAGS)
>>(compare:HI (plus:HI (match_dup 1) (match_dup 2))) (const_int 0))
>>  (set (reg CARRY_FLAGS)
>>(compare:HI (plus:HI (match_dup 1) (match_dup 2))) (const_int 0))]
>> ""
>> "@ add %A0,%A2\;adc %B0,%B2
>>..."
>> [(set_attr "length" "2, ...")])
> 
> You have presented a very good example. Are you know any port which
> already used this technique ?
> As I remember - addhi3 is a special insn which used by reload.
> The reload will generate addhi3 and reload will have a problem with
> two modified regs (ZCMP_FLAGS, CARRY_FLAGS) which will be a bad
> surprise for reload. :( As I remember.

Thanks for your patience. Now that I understand GCC's spill/reload
requirements/limitations a little better, I understand your desire to merge
compare-and-branch.

However, as an alternative to merging compare-and-branch to overcome the
fact that using a conventional add operation to calculate the effective
spill/reload address for FP offsets >63 bytes would corrupt the machine's
cc-state that a following conditional skip/branch may be dependent on,
I wonder if it may be worth considering simply saving the status register to
a temp register and restoring it after computing the spill/reload address
when a large FP offset is required. (Large offsets seem infrequent relative
to those with <63 byte offsets, so this would typically not seem to be
required?)

If this were done, then not only could compares be split from branches and
all side-effects fully disclosed, but all compares against 0 resulting from
any arbitrary expression calculation could be optimized directly up front,
without relying on a subsequent peephole optimization.

Further, if there were a convenient way to determine if the now fully
exposed cc-status register was "dead" (i.e. having no dependents), then
it should then be possible to eliminate its preservation when calculating
large FP offset spill/reload effective addresses, as it would be known that
no subsequent conditional skip/branch operations were dependent on it.

With this same strategy, it may even be desirable to then conditionally
preserve the cc-status register around all corrupting effective address
calculations when the cc-status register is not "dead", as it would seem to
be potentially more efficient to do so rather than otherwise needing
to re-compute an explicit comparison afterward?

(Observing that I'm basically suggesting treating the cc-status register
 like any other hard register, whose value would need to be saved/restored
 around any corrupting operation if its value has live dependents; what's
 preventing GCC's register and value dependency tracking logic from being
 able to manage its value properly just like it can for other register
 allocated values ?)
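
To make the save/restore idea above concrete, here is a rough host-side C model. It is only a sketch: `toy_cpu`, `toy_add16` and `spill_addr_preserving_flags` are made-up names for illustration, not GCC or AVR code, and a real AVR sequence would copy SREG to a scratch register with in/out instead.

```c
#include <stdint.h>

/* Toy machine state: a 16-bit frame pointer and a status byte.
   All names here are hypothetical; this is not GCC or AVR code. */
enum { FLAG_C = 1u << 0, FLAG_Z = 1u << 1 };

struct toy_cpu {
    uint16_t fp;    /* frame pointer */
    uint8_t  sreg;  /* condition-code (status) register */
};

/* 16-bit add that, like a real add/adc pair, overwrites the flags. */
static uint16_t toy_add16(struct toy_cpu *cpu, uint16_t a, uint16_t b)
{
    uint32_t wide = (uint32_t)a + b;
    uint16_t res = (uint16_t)wide;

    cpu->sreg = 0;
    if (res == 0)
        cpu->sreg |= FLAG_Z;
    if (wide > 0xFFFFu)
        cpu->sreg |= FLAG_C;
    return res;
}

/* Compute FP + offset for a large-offset spill slot while keeping the
   flags that a later conditional branch may depend on: save, add, restore. */
static uint16_t spill_addr_preserving_flags(struct toy_cpu *cpu,
                                            uint16_t offset)
{
    uint8_t saved = cpu->sreg;                        /* save status    */
    uint16_t addr = toy_add16(cpu, cpu->fp, offset);  /* clobbers flags */

    cpu->sreg = saved;                                /* restore status */
    return addr;
}
```

If the flags were known to be dead at that point, the save and restore lines are exactly what could be dropped.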

GCC no longer synthesizing v2sf operations from v4sf operations?

2005-03-21 Thread Richard Guenther

Hi!

For

typedef float v4sf __attribute__((vector_size(16)));
void foo(v4sf *a, v4sf *b, v4sf *c)
{
*a = *b + *c;
}

we no longer (since 4.0) synthesize v2sf (aka sse) operations
for f.i. -march=athlon (not that we were too successful at this
in 3.4 - we generated horrible code instead).  Instead for !sse2
architectures we generate standard i387 FP code (with some
unnecessary temporaries, but reasonably well).

Does this mean the generic manual vectorization with +-* experiment
has failed?  I.e. are we really supposed to use ?mmintrin.h and friends,
or of course rely on auto-vectorization?

Thanks for clarification,
Richard.
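
For reference, a small self-contained sketch of the "generic manual vectorization with +-*" style being asked about. The helper name `madd4` is made up for this example; the values go through memcpy so that nothing beyond the basic vector_size extension is assumed.

```c
#include <string.h>

/* GCC generic vector extension: four packed floats. */
typedef float v4sf __attribute__((vector_size(16)));

/* Element-wise a = b * c + d. The compiler is free to pick SIMD
   instructions if the target has them, or scalar code otherwise. */
static void vec_madd(v4sf *a, const v4sf *b, const v4sf *c, const v4sf *d)
{
    *a = *b * *c + *d;
}

/* Hypothetical convenience wrapper working on plain float arrays. */
static void madd4(float out[4], const float b[4], const float c[4],
                  const float d[4])
{
    v4sf va, vb, vc, vd;

    memcpy(&vb, b, sizeof vb);
    memcpy(&vc, c, sizeof vc);
    memcpy(&vd, d, sizeof vd);
    vec_madd(&va, &vb, &vc, &vd);
    memcpy(out, &va, sizeof va);
}
```

Which instructions come out of this depends entirely on the -march/-msse flags, which is the point of the question above.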

--
Richard Guenther 
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

Useless vectorization of small loops

2005-03-21 Thread Richard Guenther

Hi!

On mainline we now use loop versioning and peeling for alignment
for the following loop (-march=pentium4):

void foo3(float * __restrict__ a, float * __restrict__ b,
  float * __restrict__ c)
{
int i;
for (i=0; i<4; ++i)
a[i] = b[i] + c[i];
}

which results only in slower and larger code.  I also cannot
see why we zero the mm registers before loading and why we
load them high/low separated:

.L13:
xorps   %xmm1, %xmm1
movlps  (%edx,%esi), %xmm1
movhps  8(%edx,%esi), %xmm1
xorps   %xmm0, %xmm0
movlps  (%edx,%ebx), %xmm0
movhps  8(%edx,%ebx), %xmm0
addps   %xmm0, %xmm1
movaps  %xmm1, (%edx,%eax)
addl    $1, %ecx
addl    $16, %edx
cmpl    %ecx, -16(%ebp)
ja  .L13


but the point is, there is nothing to win vectorizing the loop
in the first place if we do not know alignment before.

Richard.

Re: Useless vectorization of small loops

2005-03-21 Thread Richard Guenther

On Mon, 21 Mar 2005 13:45:19 +0100 (CET), Richard Guenther
<[EMAIL PROTECTED]> wrote:
> Hi!
> 
> On mainline we now use loop versioning and peeling for alignment
> for the following loop (-march=pentium4):
> 
> void foo3(float * __restrict__ a, float * __restrict__ b,
>   float * __restrict__ c)
> {
> int i;
> for (i=0; i<4; ++i)
> a[i] = b[i] + c[i];
> }
> 
> which results only in slower and larger code.  I also cannot
> see why we zero the mm registers before loading and why we
> load them high/low separated:
> 
> .L13:
> xorps   %xmm1, %xmm1
> movlps  (%edx,%esi), %xmm1
> movhps  8(%edx,%esi), %xmm1
> xorps   %xmm0, %xmm0
> movlps  (%edx,%ebx), %xmm0
> movhps  8(%edx,%ebx), %xmm0
> addps   %xmm0, %xmm1
> movaps  %xmm1, (%edx,%eax)
> addl    $1, %ecx
> addl    $16, %edx
> cmpl    %ecx, -16(%ebp)
> ja  .L13
> 
> but the point is, there is nothing to win vectorizing the loop
> in the first place if we do not know alignment before.

Uh, and with -funroll-loops we seem to be lost completely, as we
produce peeling/loops for an eight times four rolling loop!  Where has
the information about the loop counter gone??

It looks like vectorization interacts badly with the rest of the loop
optimizers.

Ugh.

Richard.

Specifying alignment of pointer targets

2005-03-21 Thread Richard Guenther

Hi!

I'd like to specify (for vectorization) the alignment of the
target of a pointer.  I.e. I have a vector of floats that I
know is suitably aligned and that gets passed to a function
like

typedef  afloatp;

void foo(afloatp __restrict__ a, afloatp __restrict__ b,
 afloatp __restrict__ c)
{
  int i;
  for (i=0; i<4; ++i)
a[i] = b[i] + c[i];
}

now, the obvious

typedef float __attribute__((aligned(16))) * afloatp;

doesn't have any effect on (*a)'s alignment, and specifying
the alignment in the function argument list like

void foo(float __attribute__((aligned(16))) * __restrict__ a,
 float __attribute__((aligned(16))) * __restrict__ b,
 float __attribute__((aligned(16))) * __restrict__ c)

gets me

simd.c:12: error: alignment may not be specified for 'a'
simd.c:13: error: alignment may not be specified for 'b'
simd.c:14: error: alignment may not be specified for 'c'

which I find confusing.  Specifying alignment of the pointer
itself gets me beyond compiling but of course doesn't buy me
anything (the results are similar to using the typedef).

The only way I was able to convince gcc that the target of
a is aligned is using *ghasp* an aligned struct like

struct v4sf { float v[4]; } __attribute__((aligned(16)));

void foo(struct v4sf * __restrict__ a, struct v4sf * __restrict__ b,
struct v4sf * __restrict__ c)
{
int i;
for (i=0; i<4; ++i)
a->v[i] = b->v[i] + c->v[i];
}

!?

Is this really the only way?  Is it supposed to be the only way?
I remember the thread about alignment specifications for arrays
and agree that

  float __attribute__((aligned(16))) x[4];

is ill-formed, but

  float x[4] __attribute__((aligned(16)));
and
  float __attribute__((aligned(16))) *x;

are not?


Thanks for any suggestions,
Richard.
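
For readers finding this thread later: GCC releases newer than the ones discussed here (4.7 and up, as far as I know) provide `__builtin_assume_aligned`, which was not available when this was written. A minimal sketch, with `add4_aligned` being a made-up name:

```c
/* Adds four floats. The caller promises that all three pointers are
   16-byte aligned; __builtin_assume_aligned returns its argument and
   lets the compiler assume that alignment. Passing a pointer that is
   not actually aligned is undefined behaviour. */
static void add4_aligned(float *__restrict__ a,
                         const float *__restrict__ b,
                         const float *__restrict__ c)
{
    float *pa = __builtin_assume_aligned(a, 16);
    const float *pb = __builtin_assume_aligned(b, 16);
    const float *pc = __builtin_assume_aligned(c, 16);
    int i;

    for (i = 0; i < 4; ++i)
        pa[i] = pb[i] + pc[i];
}
```

The alignment promise lives in the function body rather than in the pointer type, so it sidesteps the typedef/attribute placement problems described above.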

--
Richard Guenther 
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

Re: Specifying alignment of pointer targets

2005-03-21 Thread Richard Guenther

On Mon, 21 Mar 2005, Richard Guenther wrote:

> I'd like to specify (for vectorization) the alignment of the
> target of a pointer.  I.e. I have a vector of floats that I
> know is suitable aligned and that get's passed to a function
> like
>
> typedef  afloatp;
>
> void foo(afloatp __restrict__ a, afloatp __restrict__ b,
>afloatp __restrict__ c)
> {
>   int i;
>   for (i=0; i<4; ++i)
> a[i] = b[i] + c[i];
> }
>
> now, the obvious
>
> typedef float __attribute__((aligned(16))) * afloatp;
>
> doesn't have any effect on (*a)s alignment, and specifying
> the alignment in the function argument list like

In fact,

#include <stdio.h>

typedef float __attribute__((aligned(16))) afloat;
typedef float __attribute__((aligned(16))) * afloatp;
typedef float afloata[4] __attribute__((aligned(16)));
void foo2(afloat * __restrict__ a, afloatp __restrict__ b,
  afloata c)
{
printf("%i %i %i %i\n", __alignof__(*a), __alignof__(a[1]),
   __alignof__(a[2]), __alignof__(a[3]));
printf("%i %i %i %i\n", __alignof__(*b), __alignof__(b[1]),
   __alignof__(b[2]), __alignof__(b[3]));
printf("%i %i %i %i\n", __alignof__(c[0]), __alignof__(c[1]),
   __alignof__(c[2]), __alignof__(c[3]));
}

int main()
{
float x;
foo2(&x, &x, &x);
return 0;
}

compiled with -O2 -fno-inline prints

16 16 16 16
4 4 4 4
4 4 4 4

and the first is obviously not what we want, though
element stride seems to be still four in this case.
Ideally we'd get from a solution

16 4 8 4

though

16 4 4 4

would be acceptable, too.

Richard.
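
A small sketch of how the aligned-struct workaround from the first mail shows up in `__alignof__`. The two helper names are made up; the only claim is that the struct target reports 16 while a plain float target reports something smaller.

```c
#include <stddef.h>

/* The aligned-struct workaround from the earlier mail. */
struct v4sf { float v[4]; } __attribute__((aligned(16)));

/* Alignment the compiler assumes for the object a struct pointer targets. */
static size_t struct_target_align(struct v4sf *p)
{
    return __alignof__(*p);
}

/* Alignment the compiler assumes for the target of a plain float pointer. */
static size_t float_target_align(float *p)
{
    return __alignof__(*p);
}
```

Because sizeof(struct v4sf) is also 16, every element of an array of such structs stays aligned, which is what the typedef'd float cannot express.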

Re: Useless vectorization of small loops

2005-03-21 Thread Dorit Naishlos





> Hi!
>
> On mainline we now use loop versioning and peeling for alignment
> for the following loop (-march=pentium4):
>

We don't yet use loop versioning in the vectorizer on mainline (we do on
the autovect branch). We do apply peeling.

> void foo3(float * __restrict__ a, float * __restrict__ b,
>  float * __restrict__ c)
> {
> int i;
> for (i=0; i<4; ++i)
> a[i] = b[i] + c[i];
> }
>
> which results only in slower and larger code.  I also cannot
> see why we zero the mm registers before loading and why we
> load them high/low separated:
>
> .L13:
> xorps   %xmm1, %xmm1
> movlps  (%edx,%esi), %xmm1
> movhps  8(%edx,%esi), %xmm1
> xorps   %xmm0, %xmm0
> movlps  (%edx,%ebx), %xmm0
> movhps  8(%edx,%ebx), %xmm0
> addps   %xmm0, %xmm1
> movaps  %xmm1, (%edx,%eax)
> addl    $1, %ecx
> addl    $16, %edx
> cmpl    %ecx, -16(%ebp)
> ja  .L13
>
>
> but the point is, there is nothing to win vectorizing the loop
> in the first place if we do not know alignment before.
>

The vectorizer is currently greedy - vectorizes as much as it can, no cost
considerations applied yet. Since it is not on by default under any
optimization level, and is relatively new and requires as much testing as
possible, this seemed like a reasonable approach.
Indeed, as we are handling more and more cases (unknown loop bound,
misalignment) and introducing more and more overheads, it is starting to be
imperative to consider cost and size tradeoffs. (It's also on the
vectorizer wish-list -
http://gcc.gnu.org/projects/tree-ssa/vectorization.html#vec_todo).

dorit

> Richard.
>

Re: Useless vectorization of small loops

2005-03-21 Thread Dorit Naishlos

> On Mon, 21 Mar 2005 13:45:19 +0100 (CET), Richard Guenther
> <[EMAIL PROTECTED]> wrote:
> ...
>
> Uh, and with -funroll-loops we seem to be lost completely, as we
> produce peeling/loops for a eight times four rolling loop!  Where is
> the information about the loop counter gone??
>

The thing is, you don't know at compile time the alignment of the
access you're peeling for, so the peel loop has an unknown number of
iterations, and consequently the "main" (vectorized) loop has an unknown
number of iterations.

dorit

> Ugh.
>
> Richard.
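
The point about the peel loop can be sketched in plain C. This is a hand-written illustration, not what the vectorizer emits: `peel_count` and `add_peeled` are made-up names, 16-byte vectors and 4-byte floats are assumed, and the "vector" body is spelled out as four scalar adds.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of scalar iterations needed before `p` reaches a 16-byte
   boundary. It depends on the run-time address, so the compiler cannot
   know it. Assumes sizeof(float) == 4 and a float-aligned pointer. */
static size_t peel_count(const float *p)
{
    uintptr_t mis = (uintptr_t)p & 15u;   /* bytes past a boundary */

    return mis ? (16u - mis) / sizeof(float) : 0;
}

/* a[i] = b[i] + c[i], laid out as scalar prologue (peel loop),
   4-wide main loop, and scalar epilogue. */
static void add_peeled(float *a, const float *b, const float *c, size_t n)
{
    size_t i = 0;
    size_t peel = peel_count(a);

    if (peel > n)
        peel = n;
    for (; i < peel; ++i)            /* prologue: align the store */
        a[i] = b[i] + c[i];
    for (; i + 4 <= n; i += 4) {     /* main loop: one "vector" per trip */
        a[i]     = b[i]     + c[i];
        a[i + 1] = b[i + 1] + c[i + 1];
        a[i + 2] = b[i + 2] + c[i + 2];
        a[i + 3] = b[i + 3] + c[i + 3];
    }
    for (; i < n; ++i)               /* epilogue: leftover elements */
        a[i] = b[i] + c[i];
}
```

With n fixed at 4, the main loop runs once if the store is already aligned and not at all otherwise, so its trip count is not a compile-time constant even though the original loop's was.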

Re: Useless vectorization of small loops

2005-03-21 Thread Richard Guenther

On Mon, 21 Mar 2005, Dorit Naishlos wrote:

>
>
>
>
> > On Mon, 21 Mar 2005 13:45:19 +0100 (CET), Richard Guenther
> > <[EMAIL PROTECTED]> wrote:
> > ...
> >
> > Uh, and with -funroll-loops we seem to be lost completely, as we
> > produce peeling/loops for a eight times four rolling loop!  Where is
> > the information about the loop counter gone??
> >
>
> the thing is you don't know at compile time what is the alignment of the
> access you're peeling for, so the peel-loop has unknown number of
> iterations, and consequently the "main" (vectorized) loop has unknown
> number of iterations.

Ah, ok, I see.  I guess there is no way to propagate information
on the upper bound for the loop count (which is <= 4 in any case here).
Without -funroll-loops we are currently not able
to remove the loop exit test, i.e. we keep zeroing the IV at the
beginning, adding four and then comparing with four and conditionally
branching back...  Unrolling removes this, but has bad effects on
eventually peeled loops.  Of course this is yet another artifact
of tree-complete-peeling not enabled by default / not enablable(?)
without generic rtl loop unrolling.

Richard.

--
Richard Guenther 
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

Re: GCC no longer synthesizing v2sf operations from v4sf operations?

2005-03-21 Thread Uros Bizjak

Hello!

> typedef float v4sf __attribute__((vector_size(16)));
> void foo(v4sf *a, v4sf *b, v4sf *c)
> {
>    *a = *b + *c;
> }
>
> we no longer (since 4.0) synthesize v2sf (aka sse) operations
> for f.i. -march=athlon (not that we were too successful at this
> in 3.4 - we generated horrible code instead).  Instead for !sse2
> architectures we generate standard i387 FP code (with some
> unnecessary temporaries, but reasonably well).

SSE _is_ v4sf. 'gcc -O2 -msse -S -fomit-frame-pointer' produces:

foo:
   movl    12(%esp), %eax
   movaps  (%eax), %xmm0
   movl    8(%esp), %eax
   addps   (%eax), %xmm0
   movl    4(%esp), %eax
   movaps  %xmm0, (%eax)
   ret

SSE2 is v2df.

Athlon does not handle SSE insns.

Uros.
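
For comparison, the intrinsics route ("?mmintrin.h and friends") for the same four-float add might look like the sketch below. `add4` is a made-up name; unaligned loads and stores are used so that no alignment is assumed, and a scalar fallback is kept for builds without SSE.

```c
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* a[0..3] = b[0..3] + c[0..3] */
static void add4(float *a, const float *b, const float *c)
{
#if defined(__SSE__)
    __m128 vb = _mm_loadu_ps(b);            /* unaligned load  */
    __m128 vc = _mm_loadu_ps(c);

    _mm_storeu_ps(a, _mm_add_ps(vb, vc));   /* addps + store   */
#else
    int i;

    for (i = 0; i < 4; ++i)                 /* plain scalar fallback */
        a[i] = b[i] + c[i];
#endif
}
```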

Re: GCC no longer synthesizing v2sf operations from v4sf operations?

2005-03-21 Thread Richard Guenther

On Mon, 21 Mar 2005, Uros Bizjak wrote:

> Hello!
>
> >typedef float v4sf __attribute__((vector_size(16)));
> >void foo(v4sf *a, v4sf *b, v4sf *c)
> >{
> >*a = *b + *c;
> >}
> >
> >we no longer (since 4.0) synthesize v2sf (aka sse) operations
> >for f.i. -march=athlon (not that we were too successful at this
> >in 3.4 - we generated horrible code instead).  Instead for !sse2
> >architectures we generate standard i387 FP code (with some
> >unnecessary temporaries, but reasonably well).
> >
> >
> >
> SSE _is_ v4sf. 'gcc -O2 -msse -S -fomit-frame-pointer' produces:
>
> foo:
> movl    12(%esp), %eax
> movaps  (%eax), %xmm0
> movl    8(%esp), %eax
> addps   (%eax), %xmm0
> movl    4(%esp), %eax
> movaps  %xmm0, (%eax)
> ret
>
> SSE2 is v2df.
>
> Athlon does not handle SSE insns.

Oh, so we used to expand to 3dnow?  I see that gcc 3.4 produced:

foo:
pushl   %ebp
movl    %esp, %ebp
pushl   %ebx
subl    $84, %esp
movl    12(%ebp), %eax
movl    16(%ebp), %edx
[...]
movq    -64(%ebp), %mm0
movl    %ebx, -72(%ebp)
movl    -36(%ebp), %ebx
movl    %ebx, -68(%ebp)
pfadd   -72(%ebp), %mm0
movq    %mm0, -56(%ebp)
movl    12(%eax), %eax
etc.

This doesn't happen anymore with 4.0/4.1.

Richard.

--
Richard Guenther 
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

[autovect] Bootstrap failure on i686

2005-03-21 Thread Richard Guenther

Hi!

Bootstrap of autovect-branch fails on i686 with

stage1/xgcc -Bstage1/
-B/home/rguenth/ix86/gcc-autovect-210305/i686-pc-linux-gnu/bin/ -c   -O2
-g -fomit-frame-pointer -DIN_GCC   -W -Wall -Wwrite-strings
-Wstrict-prototypes -Wmissing-prototypes -pedantic -Wno-long-long
-Wno-variadic-macros -Wold-style-definition -Werror -DHAVE_CONFIG_H
-I. -I. -I/net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc
-I/net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc/.
-I/net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc/../include
-I/net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc/../libcpp/include
/net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc/tree-data-ref.c -o
tree-data-ref.o
cc1: warnings being treated as errors
/net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc/tree-data-ref.c: In
function 'address_analysis':
/net/alwazn/home/rguenth/src/gcc/gcc-autovect/gcc/tree-data-ref.c:1181:
warning: comparison between signed and unsigned

Richard.
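
For anyone unfamiliar with this class of warning, here is a generic sketch of what -W/-Wextra complains about in C and one common way to fix it. This is not the code at tree-data-ref.c:1181, which is not shown in this mail; both function names are made up.

```c
/* What a mixed comparison `i < u` really computes: `i` is converted to
   unsigned first, so -1 turns into UINT_MAX. The cast is written out
   here so the example itself compiles without the warning. */
static int less_as_compiler_sees_it(int i, unsigned int u)
{
    return (unsigned int)i < u;
}

/* One possible fix: deal with negative values first, then compare
   as unsigned. */
static int less_fixed(int i, unsigned int u)
{
    return i < 0 || (unsigned int)i < u;
}
```

With -Werror in the bootstrap flags the warning becomes fatal, which is why a single mixed comparison is enough to stop the build.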

--
Richard Guenther 
WWW: http://www.tat.physik.uni-tuebingen.de/~rguenth/

Re: Extra gcc-3.3 java failures when using expect-5.43

2005-03-21 Thread Kaveh R. Ghazi

 > From: Andrew Haley  
 > 
 > Kaveh R. Ghazi writes:
 >  > After I upgraded to expect-5.43, I noticed that I'm getting extra
 >  > java failures on the 3.3 branch on x86_64-unknown-linux-gnu.  Other
 >  > gcc branches do not have problems.
 >  > 
 >  > http://gcc.gnu.org/ml/gcc-testresults/2005-03/msg01295.html
 >  > 
 >  > I'm using an expect-5.43 binary on x86_64 that was compiled on i686
 >  > if that matters.
 >  > 
 >  > When I back down to expect-5.42.1, the testsuite results go back to
 >  > normal.  Anyone else seeing this?
 > 
 > Could you post a snippet of the log, please?
 > Andrew.

There was nothing useful in libjava.log to indicate what the problem
is.  I reran the testsuite with --verbose and all the errors show
up like this:

spawning command /tmp/kg/33/build/x86_64-unknown-linux-gnu/./libjava/gij 
ArrayStore
exp6 file5
close result is child killed: SIGABRT
FAIL: ArrayStore execution - gij test

Don't know who/what is sending a SIGABRT.

Again, if I back down to expect 5.42.1 everything passes.  And also it
only occurs on the 3.3 branch.  Other branches and mainline pass
fine.  So there may be a diff in the testsuite harness. (?)

--Kaveh
--
Kaveh R. Ghazi  [EMAIL PROTECTED]

Re: Ada and ARM build assertion failure

2005-03-21 Thread Geert Bosch

On Mar 21, 2005, at 02:54, Nick Burrett wrote:
> This seems to be a recurrence of PR5677.

I'm sorry, but I can't see any way this is related, could you elaborate?

> for Aligned_Word'Alignment use
> - Integer'Min (2, Standard'Maximum_Alignment);
> + Integer'Min (4, Standard'Maximum_Alignment);

This patch is wrong, as it implicitly increases the size of Aligned_Word
from 2 to 4 bytes: size is always a multiple of the alignment.

However, it is really dubious you need to change this package, as it is
only used for DEC Ada compatibility on VMS systems.

  -Geert
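
The "size is always a multiple of the alignment" rule can be seen with a rough C analogue. This is only an analogy: Ada representation clauses are not the same mechanism as GCC's aligned attribute, and the type and function names below are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Same 16-bit payload, two different alignments. */
struct word2 { uint16_t v; } __attribute__((aligned(2)));
struct word4 { uint16_t v; } __attribute__((aligned(4)));

/* Distance in bytes between consecutive array elements. It has to be a
   multiple of the alignment, otherwise the second element would be
   misaligned. */
static size_t stride_word2(void)
{
    struct word2 a[2];

    return (size_t)((char *)&a[1] - (char *)&a[0]);
}

static size_t stride_word4(void)
{
    struct word4 a[2];

    return (size_t)((char *)&a[1] - (char *)&a[0]);
}
```

Raising the alignment from 2 to 4 therefore silently doubles the size of the type, which is the objection to the patch.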

Re: GCC no longer synthesizing v2sf operations from v4sf operations?

2005-03-21 Thread Uros Bizjak

Richard Guenther wrote:

> Oh, so we used to expand to 3dnow?  I see with gcc 3.4 produced:
>
> foo:
>    pushl   %ebp
>    movl    %esp, %ebp
>    pushl   %ebx
>    subl    $84, %esp
>    movl    12(%ebp), %eax
>    movl    16(%ebp), %edx
> [...]
>    movq    -64(%ebp), %mm0
>    movl    %ebx, -72(%ebp)
>    movl    -36(%ebp), %ebx
>    movl    %ebx, -68(%ebp)
>    pfadd   -72(%ebp), %mm0
>    movq    %mm0, -56(%ebp)
>    movl    12(%eax), %eax
> etc.
>
> This doesn't happen anymore with 4.0/4.1.

IIRC, any generic code that produces MMX or 3DNow! instructions is
disabled ATM, because gcc doesn't know how/when to insert the emms/femms
instruction. You don't want to mix 3DNow! insns with x87 insns and use the
shared 3DNow!/x87 registers without this insn...

Uros.
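
When MMX is used by hand through intrinsics, placing emms is the programmer's job, which shows what the compiler would have to work out on its own. A minimal sketch, assuming an x86 target with mmintrin.h; `add4_i16` and `scaled_sum` are made-up names and a scalar fallback covers other targets.

```c
#include <string.h>
#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Adds four 16-bit lanes. With MMX this uses the MMX registers, which
   alias the x87 register stack. */
static void add4_i16(short out[4], const short a[4], const short b[4])
{
#if defined(__MMX__)
    __m64 va, vb, vr;

    memcpy(&va, a, sizeof va);
    memcpy(&vb, b, sizeof vb);
    vr = _mm_add_pi16(va, vb);      /* paddw */
    memcpy(out, &vr, sizeof vr);
    _mm_empty();                    /* emms: give the registers back to x87 */
#else
    int i;

    for (i = 0; i < 4; ++i)
        out[i] = (short)(a[i] + b[i]);
#endif
}

/* Floating-point work that may run on x87 after the MMX code above. */
static double scaled_sum(const short v[4], double scale)
{
    return (v[0] + v[1] + v[2] + v[3]) * scale;
}
```

Forgetting the `_mm_empty()` call before x87 code runs is the mixing problem described above.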

Re: AVR indirect_jump addresses limited to 16 bits

2005-03-21 Thread Marek Michalkiewicz

On Sun, Mar 20, 2005 at 04:29:01PM -0800, Richard Henderson wrote:

> The easiest way is to do this in the linker instead of the compiler.
> See the xstormy16 port and how it handles R_XSTORMY16_FPTR16.  This
> has the distinct advantage that you do not commit to the creation of
> an indirect jump until you discover that the target label is outside
> the low 64k.

Looks perfect to me.  So we are not the first architecture needing
such tricks...  AVR would need 3 new relocs, used like this:

.word pm16(label)

ldi r30,pm16_lo8(label)
ldi r31,pm16_hi8(label)

and the linker can do the rest of the magic (add jumps in a section
below 64K words if the label is above).

Cc: to Denis, as I may need help actually implementing these changes
(you know binutils internals much better than I do).

Thanks,
Marek

Re: Ada and ARM build assertion failure

2005-03-21 Thread Nick Burrett

Geert Bosch wrote:
> On Mar 21, 2005, at 02:54, Nick Burrett wrote:
> > This seems to be a recurrence of PR5677.
>
> I'm sorry, but I can't see any way this is related, could you elaborate?

Sorry, I completely misread the PR.  It is not related.

> > for Aligned_Word'Alignment use
> > - Integer'Min (2, Standard'Maximum_Alignment);
> > + Integer'Min (4, Standard'Maximum_Alignment);
>
> This patch is wrong, as it implicitly increases the size of Aligned_Word
> from 2 to 4 bytes: size is always a multiple of the alignment.

OK, but if I don't apply the patch, GNAT complains that the alignment
should be 4, not 2, and compiling ceases.

> However, it is really dubious you need to change this package, as it is
> only used for DEC Ada compatibility on VMS systems.

OK, but all systems build it, as it is unconditionally defined in
Makefile.rtl::GNATRTL_NONTASKING_OBJS

And here it exists in an i686-linux build:

[EMAIL PROTECTED] rts]$ ls -l s-aux*
lrwxrwxrwx  1 nick nick    50 Mar 18 12:51 s-auxdec.adb -> /home/nick/riscos-elf/gcc-4.0/gcc/ada/s-auxdec.adb
lrwxrwxrwx  1 nick nick    50 Mar 18 12:51 s-auxdec.ads -> /home/nick/riscos-elf/gcc-4.0/gcc/ada/s-auxdec.ads
-r--r--r--  1 nick nick 19835 Mar 18 12:57 s-auxdec.ali
-rw-rw-r--  1 nick nick 32908 Mar 18 12:57 s-auxdec.o
[EMAIL PROTECTED] rts]$

Nick.

Re: Ada and ARM build assertion failure

2005-03-21 Thread Geert Bosch

On Mar 21, 2005, at 11:02, Nick Burrett wrote:
> OK, but if I don't apply the patch, GNAT complains that the alignment
> should be 4, not 2 and compiling ceases.

Yes, this is related to PR 17701 as Arno pointed out to me in a private
message. Indeed, the patch you used works around this failure and can be
used as a kludge. Properly disabling building this package would be better,
but there isn't a mechanism for that yet.

However, this all is entirely unrelated to the failure you're seeing.

Re: AVR: CC0 to CCmode conversion

2005-03-21 Thread Denis Chertykov

Richard Henderson <[EMAIL PROTECTED]> writes:

> On Sun, Mar 20, 2005 at 01:59:44PM +0300, Denis Chertykov wrote:
> > The reload will generate addhi3 and reload will have a problem with
> > two modified regs (ZCMP_FLAGS, CARRY_FLAGS) which will be a bad
> > surprise for reload. :( As I remember.
> 
> In order to expose the flags register before reload, you *must*

To be precise: while reload_in_progress.

> have load, store, reg-reg move, and add operations that do not
> modify the flags.

They (load, store, add) can modify the flags before reload
(while reload_in_progress is not set).
Is this OK?

Denis.

Re: AVR: CC0 to CCmode conversion

2005-03-21 Thread Denis Chertykov

Paul Schlie <[EMAIL PROTECTED]> writes:

> > From: Denis Chertykov <[EMAIL PROTECTED]>
> >> - possibly something like: ?
> >> 
> >>   (define_insn "*addhi3"
> >> [(set (match_operand:HI 0 ...)
> >>(plus:HI (match_operand:HI 1 ...)
> >> (match_operand:HI 2 ...)))
> >>  (set (reg ZCMP_FLAGS)
> >>(compare:HI (plus:HI (match_dup 1) (match_dup 2))) (const_int 0))
> >>  (set (reg CARRY_FLAGS)
> >>(compare:HI (plus:HI (match_dup 1) (match_dup 2))) (const_int 0))]
> >> ""
> >> "@ add %A0,%A2\;adc %B0,%B2
> >>..."
> >> [(set_attr "length" "2, ...")])
> > 
> > You have presented a very good example. Are you know any port which
> > already used this technique ?
> > As I remember - addhi3 is a special insn which used by reload.
> > The reload will generate addhi3 and reload will have a problem with
> > two modified regs (ZCMP_FLAGS, CARRY_FLAGS) which will be a bad
> > surprise for reload. :( As I remember.
> 
> Thanks for your patience, and now that I understand GCC's spill/reload
> requirements/limitations a little better; I understand your desire to merge
> compare-and-branch.

I don't want to merge compare-and-branch because (as Richard said)
"explicit compare elimination by creating
 even larger fused operate-compare-and-branch instructions that
 could be recognized by combine.  I wouldn't actually recommend
 this though, because branch instructions with output reloads are
 EXTREMELY DIFFICULT to implement properly.".
  (IMPOSSIBLE for AVR) 

I want to have two separate insns: compare and branch.

> However, as an alternative to merging compare-and-branch's to
> overcome the fact that the use of a conventional add operation to
> calculate the effective spill/reload address for FP offsets >63
> bytes would corrupt the machine's cc-state that a following
> conditional skip/branch may be dependant on; I wonder if it may be
> worth considering simply saving the status register to a temp
> register and restoring it after computing the spill/reload address
> when a large FP offset is required. (which seems infrequent relative
> to those with <63 byte offsets, so would typically not seem to be
> required?)
> 
> If this were done, then not only could compares be split from branches, and
> all side-effects fully disclosed; but all compares against 0 resulting from
> any arbitrary expression calculation may be initially directly optimized
> without relying on a subsequent peephole optimization to accomplish.
> 
> Further, if there were a convenient way to determine if the now fully
> exposed cc-status register was "dead" (i.e. having no dependants), then
> it should be then possible to eliminate its preservation when calculating
> large FP offset spill/reload effective addresses, as it would be known that
> no subsequent conditional skip/branch operations were dependant on it.
> 
> With this same strategy, it may even be desirable to then conditionally
> preserve the cc-status register abound all corrupting effective address
> calculations when cc-status register is not "dead", as it would seem to
> be potentially more efficient to do so rather than otherwise needing
> to re-compute an explicit comparison afterward?

I think that it's a better way. I will test it.

> (Observing that I'm basically suggesting treating the cc-status register
>  like any other hard register, who's value would need to be saved/restored
>  around any corrupting operation if it's value has live dependants; what's
>  preventing GCC's register and value dependency tracking logic from being
>  able to manage its value properly just like it can for other register
>  allocated values ?)

Why not a CCmode register?

Denis.

Re: AVR indirect_jump addresses limited to 16 bits

2005-03-21 Thread Denis Chertykov

Marek Michalkiewicz <[EMAIL PROTECTED]> writes:

> On Sun, Mar 20, 2005 at 04:29:01PM -0800, Richard Henderson wrote:
> 
> > The easiest way is to do this in the linker instead of the compiler.
> > See the xstormy16 port and how it handles R_XSTORMY16_FPTR16.  This
> > has the distinct advantage that you do not commit to the creation of
> > an indirect jump until you discover that the target label is outside
> > the low 64k.
> 
> Looks perfect to me.  So we are not the first architecture needing
> such tricks...  AVR would need 3 new relocs, used like this:
> 
>   .word pm16(label)
> 
>   ldi r30,pm16_lo8(label)
>   ldi r31,pm16_hi8(label)
> 
> and the linker can do the rest of the magic (add jumps in a section
> below 64K words if the label is above).
> 
> Cc: to Denis, as I may need help actually implementing these changes
> (you know binutils internals much better than I do).

Richard is right. It is better to support this in binutils.
Right now I'm busy with the CC0 to CCmode conversion.
(you must learn binutils ;)

Denis.

Re: AVR indirect_jump addresses limited to 16 bits

2005-03-21 Thread Paul Schlie




> From: Marek Michalkiewicz <[EMAIL PROTECTED]>
>> On Sun, Mar 20, 2005 at 04:29:01PM -0800, Richard Henderson wrote:
>> The easiest way is to do this in the linker instead of the compiler.
>> See the xstormy16 port and how it handles R_XSTORMY16_FPTR16.  This
>> has the distinct advantage that you do not commit to the creation of
>> an indirect jump until you discover that the target label is outside
>> the low 64k.
> 
> Looks perfect to me.  So we are not the first architecture needing
> such tricks...  AVR would need 3 new relocs, used like this:
> 
> .word pm16(label)
> 
> ldi r30,pm16_lo8(label)
> ldi r31,pm16_hi8(label)
> 
> and the linker can do the rest of the magic (add jumps in a section
> below 64K words if the label is above).
> 
> Cc: to Denis, as I may need help actually implementing these changes
> (you know binutils internals much better than I do).

- yup, and nicer than trying to play games with alignment, etc.,

  And just to double check, using the earlier example:

> int foo(int dest)
> {
>__label__ l1, l2, l3;
>void *lb[] = { &&l1, &&l2, &&l3 };
>int x = 0;
> 
>goto *lb[dest];
> 
> l1:
>x += 1;
> l2:
>x += 1;
> l3:
>x += 1;
>return x;
> }

  It would seem that the only time the pm16(label) address would ever
  be used would be as an initializing constant pointer value being assigned
  to a _label_/function pointer variable, as a CALL/JUMP LABEL instruction
  would otherwise be used to call/jump to the true entry point directly.
  (is that correct?)
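
  The same distinction, sketched for function pointers in plain C. The names are made up, and the comments about pm16 only restate the proposal in this thread, not anything binutils does today:

```c
/* Two ordinary functions whose addresses end up in a data table. */
static int add_one(int x) { return x + 1; }
static int twice(int x)   { return x * 2; }

/* Address constants stored as data. On AVR these are the 16-bit values
   the thread proposes to emit as `.word pm16(label)`, so that the linker
   could redirect them through a stub in low memory when needed. */
static int (*const handlers[])(int) = { add_one, twice };

/* Indirect call: the target comes out of the table at run time. */
static int dispatch(unsigned int which, int x)
{
    return handlers[which % 2u](x);
}

/* Direct call: the target is encoded in the call instruction itself,
   so no 16-bit pointer value is involved. */
static int direct(int x)
{
    return add_one(x);
}
```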

Re: Weird behavior in ivopts code

2005-03-21 Thread Jeffrey A Law

On Fri, 2005-03-18 at 18:25 +0100, Zdenek Dvorak wrote:
> Hello,
> 
> > Which appears to walk down the array and try and choose better IV sets.
> > Since it walks down the IV array, which is in SSA_NAME_VERSION order.
> > Thus two loops which are equivalent in all ways except that they use
> > different SSA_NAME_VERSIONs can get different IV sets.
> > 
> > Anyway, the instability of the IV opts code when presented with loops
> > that differ only in the version #s in the SSA_NAMEs they use is really
> > getting in the way of understanding the performance impact of the
> > new jump threading code.  I would expect this instability to also make
> > it difficult to analyze the IVopts in general.
> 
> there's not much to do about the matter.  The choice of ivs in ivopts is
> just a heuristics (and this cannot be changed, due to compiler
> performance reasons), and as such it is prone to such instabilities.
> In fact, both choices of ivs make sense, and they have the same cost
> w.r. to the cost function used by ivopts.
Sigh.  I had a feeling that might be the case.

> 
> Anyway, in this particular case, the patch below should make ivopts
> to prefer the choice of preserving the variable 'i' (which is better
> from the point of preserving debugging information).
Seems reasonable to me given that it preserves debugging information;
it also seems to stabilize IV opts significantly for EON (I was seeing
+-20% performance swings due to this issue for EON).

Consider it pre-approved once you run a bootstrap and regression test.

jeff
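
To illustrate "both choices of ivs make sense": the two functions below are the same loop written with two different induction variables, an index and a pointer. This is hand-written C for illustration only, not ivopts output, and the function names are made up.

```c
#include <stddef.h>

/* Induction variable: the index i (keeps the user's variable alive,
   which is the debug-friendly choice). */
static long sum_index(const int *a, size_t n)
{
    long s = 0;
    size_t i;

    for (i = 0; i < n; ++i)
        s += a[i];
    return s;
}

/* Induction variable: a pointer stepping through the array. */
static long sum_pointer(const int *a, size_t n)
{
    long s = 0;
    const int *p;
    const int *end = a + n;

    for (p = a; p != end; ++p)
        s += *p;
    return s;
}
```

Both forms do one increment and one compare per iteration, so a cost function can easily rate them as equal and pick either one.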

Re: AVR: CC0 to CCmode conversion

2005-03-21 Thread Paul Schlie

> From: Denis Chertykov <[EMAIL PROTECTED]>
>> Paul Schlie <[EMAIL PROTECTED]> writes:
>> ...
>> (Observing that I'm basically suggesting treating the cc-status register
>>  like any other hard register, who's value would need to be saved/restored
>>  around any corrupting operation if it's value has live dependants; what's
>>  preventing GCC's register and value dependency tracking logic from being
>>  able to manage its value properly just like it can for other register
>>  allocated values ?)
> 
> Why not CCmode register ?

- For what? It would seem that as long as all rtl instruction data-flow
  dependencies are satisfied, the code will execute the program correctly ?

 (as all conditionals are effectively based upon a comparison of a result
  against 0, and GCC always converts both operands to the same type, so
  all that's necessary to know is if that type is signed or unsigned, as
  even floats compare just like signed integers. Therefore it would seem
  that the only difference between a compare operation and a subtract is
  that it doesn't produce a value result which clobbers one of its
  operands; otherwise they're identical, therefore arguably just an
  optimization to be used instead of a subtract when the result value
  isn't needed or invalid when comparing floats. It would seem ?)
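
  A small host-side model of the integer part of this argument: an 8-bit subtract that also reports the flags a compare would set. The struct and function names are made up. It covers integers only and says nothing about floating-point compares, where NaNs complicate the picture.

```c
#include <stdint.h>

/* Flags produced by an 8-bit subtract (or compare) a - b. */
struct flags8 {
    unsigned int c;  /* borrow: unsigned a < b       */
    unsigned int z;  /* result is zero: a == b       */
    unsigned int n;  /* sign bit of the result       */
    unsigned int v;  /* two's complement overflow    */
};

/* Performs a - b. A "compare" is the same operation with the result
   thrown away, so pass NULL (0) for `result` to model it. */
static struct flags8 sub_flags8(uint8_t a, uint8_t b, uint8_t *result)
{
    uint8_t r = (uint8_t)(a - b);
    struct flags8 f;

    f.c = (unsigned int)(a < b);
    f.z = (unsigned int)(r == 0);
    f.n = (r >> 7) & 1u;
    /* Overflow: operands differ in sign and the result's sign differs
       from the first operand's. */
    f.v = (unsigned int)(((a ^ b) & (a ^ r)) >> 7) & 1u;
    if (result)
        *result = r;
    return f;
}
```

  Unsigned "less than" is then just the C flag and signed "less than" is N xor V, so signedness decides which flags a branch looks at rather than how the subtract is done.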

Runtime-library versioning patch checked in

2005-03-21 Thread Zack Weinberg


I have checked in the patch to clean up after GCC's change to version
number handling.  This should address all reported issues with build,
installation, etc.  Per Ian's suggestion, I am doing a multilib-ful
build with a relative $(srcdir), which may expose more problems, which
will be addressed in a follow-up.

It is now safe again to modify configure scripts in the gcc
repository; you should not encounter missing .m4 files.  I have not
audited the src repository for problematic constructs.  The odds are
low that any problems exist, since the affected macros and variables
all have to do with passing information from the gcc subdirectory to
runtime library directories, but just for the record:
TL_AC_GCC_VERSION no longer exists, and TL_AC_GXX_INCLUDE_DIR has
changed semantics.  Also, the top level Makefile no longer passes down
certain variables: libsubdir, libstdcxx_incdir, gxx_include_dir,
gcc_version, gcc_version_full, and gcc_version_trigger.

I would like to take this opportunity to encourage the libjava,
libffi, and libstdc++ maintainers to convert to nonrecursive
Makefiles.  (In other words, just one Makefile.am/Makefile.in pair at
the top level of your subdirectory; perhaps with included fragments in
lower-level subdirectories.)

zw

removal of -mflat in GCC 4.0 (sparc)

2005-03-21 Thread Garrett D'Amore

Esteemed GCC developers:
I am writing to request that the sparc -mflat option be retained in 
GCC 4.0.

The reason this particular register model is important to me is that I 
use GCC on the microSPARC-IIep (actually, a SoC variant produced by 
Infineon called the "copernicus") to build firmware for Sun Ray 
appliances (ultra-thin client).  These SparcV8 processors only have two 
register windows, so the -mflat option is required for performance.  
(The Sun Ray kernel also doesn't have a handler for register window 
over/underflow traps.)  The register windows are used exclusively for 
entry into the interrupt handler.

I believe that the uSPARC-IIep is a truly "open-source" processor: the 
specs for the entire processor are freely downloadable from Sun -- 
anyone can produce cores with this chip for free.  While I'm not a VLSI 
engineer, I think that everything one needs to go to production on such 
cores is already freely available from Sun.  (So I'm saying that the 
chip *implementation* is open, not just the instruction set or pin-outs.)

Note also that these parts (Copernicus at least) are real, currently 
shipping products.  This is *not* a legacy product, and active new 
development is continuing on this stuff.

Btw, I think the uSPARC-IIep is also used in certain older JavaStations, 
which can and do run Linux.  I'm not sure how many of them there are out 
there, but Linux seems to be the OS of choice for these boxes.  I don't 
know whether -mflat is used in that environment or not.  It certainly 
seems like it could be, to good effect for performance.  (Otherwise 
nearly every save/restore would involve an interrupt for window overflow 
and underflow.)

I do realize (and am very grateful) that GCC is a free project, and that 
the needs of the many may outweigh the needs of the few.  (I am also 
asking my company to make a donation to the FSF in support of GCC, 
independent of the decision about -mflat.  Whether such a donation is 
made or not depends on management though, so I can't promise anything.)

But I also thought it was a real possibility that the GCC team might not 
be aware of anyone who had a real requirement for the -mflat option.  I 
certainly appreciate the value of trimming support for defunct 
architectures and such from a product to reduce code size and 
complexity.  However, in this case such a trim does potentially impact 
otherwise happy GCC users.

I'd be grateful if the team could consider reversing the decision on 
-mflat.  While I can continue to use older versions of GCC, I'd like the 
ability to update to new compilers later if possible.  (It certainly 
seems like some of the new optimizations in GCC 4.0 might be very useful 
to us.)

Thank you in advance for your consideration of this request.
   -- Garrett

Re: Copyright question: libgcc GPL exceptions

2005-03-21 Thread Mike Stump

On Mar 19, 2005, at 7:23 AM, Bernd Schmidt wrote:
> I'm updating the copyrights in the Blackfin port, and I noticed
> that there appear to be two versions of the wording that allows
> more-or-less unlimited use of libgcc files.  One can be found e.g.
> in config/arm/crtn.asm:
>
>   As a special exception, if you link this library with files
>   compiled with GCC to produce an executable, this does not cause
>   the resulting executable to be covered by the GNU General Public
>   License.
>   This exception does not however invalidate any other reasons why
>   the executable file might be covered by the GNU General Public
>   License.
>
> the other in config/arm/lib1funcs.asm:
>
>   In addition to the permissions in the GNU General Public License, the
>   Free Software Foundation gives you unlimited permission to link the
>   compiled version of this file into combinations with other programs,
>   and to distribute those combinations without any restriction coming
>   from the use of this file.  (The General Public License restrictions
>   do apply in other respects; for example, they cover modification of
>   the file, and distribution when not linked into a combine
>   executable.)

The canonical form can be found in gcc/libgcc2.c:

In addition to the permissions in the GNU General Public License, the
Free Software Foundation gives you unlimited permission to link the
compiled version of this file into combinations with other programs,
and to distribute those combinations without any restriction coming
from the use of this file.  (The General Public License restrictions
do apply in other respects; for example, they cover modification of
the file, and distribution when not linked into a combine
executable.)

> Is there a particular reason to use one or the other, or are they
> equivalent?

Yes, one is old and should be updated to the canonical form.

Licensing question about libobjc

2005-03-21 Thread Andrew Pinski

I notice that libobjc has a different exception than all of the other
ones which have an exception to the GPL.  Is there a reason behind this?

The difference between the libobjc exception and the one in
libgcc/libstdc++ is that the libobjc exception only covers the case
where all sources were compiled with GCC.

Thanks,
Andrew Pinski

Obsoleting more ports for 4.0.

2005-03-21 Thread Kazu Hirata

Hi,

First off, Mark, if you think this stuff is too late for 4.0, I'll
postpone this to 4.1.  Please note that all we have to do is add a few
lines to config.gcc as far as printing the "obsolete" message is
concerned.

Below, I propose to obsolete the following five architectures for GCC
4.0 and remove them for 4.1 unless somebody steps up and does *real
work*.

If you are working on these ports, please send us real patches.

If you would like to work on these ports privately, please refrain
from telling us that port xxx should be kept.  The more ports we have,
the more work we would have to do to clean up the infrastructure
connecting the middle end and the backends.

arc
---

  No maintainer.

  PR 8972 hasn't been fixed.  GCC miscompiles a function as simple as:

int
f (int x, int i)
{
  return x << i;
}

  There was some recent interest, for example

http://gcc.gnu.org/ml/gcc/2004-10/msg00408.html

  Obsoleting a port on the grounds of a single bug may seem a bit
  strange.  However, PR 8972 implies that nobody is working on the
  FSF's mainline at least.

  PR 8973 hasn't been fixed.

fr30
----

  The same justification as

http://gcc.gnu.org/ml/gcc/2004-06/msg01113.html

  Nobody showed an interest in keeping this port.

i860
----

  A hardware implementation is not currently being manufactured.

  Jason Eckhardt, the maintainer of i860, has told us that it would be
  OK to go.

http://gcc.gnu.org/ml/gcc-patches/2005-03/msg02033.html

ip2k
----

  PR 20582: libgcc build fails despite some interest, such as

http://gcc.gnu.org/ml/gcc/2004-06/msg01128.html

ns32k
-----

  The same justification as

http://gcc.gnu.org/ml/gcc/2004-06/msg01113.html

  Nobody showed an interest in keeping this port.

Kazu Hirata

Re: Licensing question about libobjc

2005-03-21 Thread Mike Stump

On Mar 21, 2005, at 3:05 PM, Andrew Pinski wrote:
> I notice that libobjc has a different exception than all of the
> other ones which have an exception to the GPL.  Is there a reason
> behind this?
>
> The difference between the libobjc exception and the one in
> libgcc/libstdc++ is that the libobjc exception only covers the case
> where all sources were compiled with GCC.

I believe if you researched this, you would find that they derive from
a common ancestor, and that libobjc just fell behind.

Here is rcsdiff -r1.1 -r1.158 libgcc2.c from oldgcc:
< /* As a special exception, if you link this library with files
> /* As a special exception, if you link this library with other files,
>some of which are compiled with GCC, to produce an executable,
>this library does not by itself cause the resulting executable
>to be covered by the GNU General Public License.

which shows some of the history.
You can update to the current canonical spelling in libgcc2.c.

RFA; DFP and REAL_TYPE?

2005-03-21 Thread Jon Grimm

So I've been looking at using REAL_TYPE to represent decimal floating
point values internally (to implement the C extensions for decimal 
floating point.)  I believe David and yourself had some discussions on 
this some short time back.

Anyway, I've now had a chance to play with this a bit, but not quite 
sure how well I like the way it's coming out (though the alternative of 
introducing a new type seems worse, imo).  Warning: My thinking is likely 
clouded by a goal to wire in the decNumber routines to implement the 
algorithms/encodings for decimal floats (still working through 
permissions for this to happen though).

First, I think we need to avoid going into the GCC REAL internal
binary float representation for decimal floats.   I'm guessing that going
into the binary representation (then performing various arithmetic
operations) and then eventually dropping back out to decimal float will
end up with the very errors that decimal float is meant to avoid in the
first place.

I'm looking for advice on going forward.  I've already hacked up
real_value.sig to hold a decimal128 encoded value.  This is fugly, and
obviously all sorts of things in real.c would break if I started using
the various functions for real.  But before I put any significant
work into the REAL_TYPE path, I thought it best to get guidance.

1) Stick with REAL_TYPE or is it hopeless and I should create DFLOAT_TYPE?
2) If the recommendation is to stick with REAL_TYPE, is it ok to have
some other internal representation?
3) Is there a preferred way to override real_value functions?  I'm
assuming that even if I use the real_value->sig field to hold the
coefficient rather than the ugly hack of holding a decimal128, I'll
need to override various functions in real.c to 'do the right thing' for
radix 10 reals.  I could add a field to real_value to point to a
function table that, if present, would be called through.  Or simply add
various "if (r->b == 10)" checks throughout real.c.  Or other.
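
The two alternatives in question 3 can be sketched side by side in C. This is only an illustration of the dispatch styles, not real.c: struct toy_real, its fields, and every function name below are hypothetical, and the single "operation" (expanding sig * radix^exp to an integer) was picked only because it differs between radix 2 and radix 10.

```c
#include <stddef.h>

struct toy_real;

/* Hypothetical table of operations a radix-10 value could override. */
struct toy_real_ops {
    /* Integer value of sig * radix^exp, for small non-negative exp. */
    unsigned long (*to_ulong)(const struct toy_real *r);
};

/* Hypothetical stand-in for real_value; not GCC's actual layout. */
struct toy_real {
    int b;                            /* radix: 2 or 10 */
    unsigned long sig;                /* significand / coefficient */
    unsigned int exp;                 /* small non-negative exponent */
    const struct toy_real_ops *ops;   /* NULL: use the binary default */
};

static unsigned long binary_to_ulong(const struct toy_real *r)
{
    return r->sig << r->exp;
}

static unsigned long decimal_to_ulong(const struct toy_real *r)
{
    unsigned long v = r->sig;
    unsigned int i;

    for (i = 0; i < r->exp; i++)
        v *= 10;
    return v;
}

static const struct toy_real_ops decimal_ops = { decimal_to_ulong };

/* Style A: call through the function table when one is present. */
static unsigned long to_ulong_via_table(const struct toy_real *r)
{
    if (r->ops != NULL)
        return r->ops->to_ulong(r);
    return binary_to_ulong(r);
}

/* Style B: an explicit radix check at each point of use. */
static unsigned long to_ulong_via_check(const struct toy_real *r)
{
    if (r->b == 10)
        return decimal_to_ulong(r);
    return binary_to_ulong(r);
}
```

Both styles give the same answers here; the trade-off is that the table keeps the radix-10 code in one place, while the checks touch every function that needs to behave differently.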

Thoughts/concerns/questions/advice?
Best Regards,
Jon Grimm
IBM Linux Technology Center.

A question about java/lang.c:java_get_callee_fndecl.

2005-03-21 Thread Kazu Hirata

Hi,

I see that the implementation of LANG_HOOKS_GET_CALLEE_FNDECL in Java
always returns NULL (at least for the time being).

static tree
java_get_callee_fndecl (tree call_expr)
{
  tree method, table, element, atable_methods;

  HOST_WIDE_INT index;

  /* FIXME: This is disabled because we end up passing calls through
 the PLT, and we do NOT want to do that.  */
  return NULL;

  :
  :

Is anybody planning to fix this?  If not, I'm thinking about removing
this language hook.  The reason is not just clean up.  Rather it is
because I need to change the prototype of get_callee_fndecl and
LANG_HOOKS_GET_CALLEE_FNDECL.  Currently, fold_ternary has the
following call tree.

  fold_ternary
get_callee_fndecl
  java_get_callee_fndecl

If I change fold_ternary to take components of CALL_EXPR like the
address expression of CALL_EXPR and the argument list, instead of
CALL_EXPR itself, I would have to change java_get_callee_fndecl to
take the first operand of a CALL_EXPR, instead of a CALL_EXPR.

It's not that the change is so involved, but it doesn't make much
sense to keep something dead up to date.

In other words, when I posted the following patch

  http://gcc.gnu.org/ml/gcc-patches/2005-03/msg02038.html

Roger Sayle requested to keep the call to get_callee_fndecl so that we
can "fold" the first operand of a CALL_EXPR to a FUNCTION_DECL.

FYI, the above FIXME comes from

  http://gcc.gnu.org/ml/java-patches/2004-q2/msg00083.html

Kazu Hirata

expand_binop misplacing results?

2005-03-21 Thread DJ Delorie


gcc.c-torture/execute/2403-1.c tripped over this on an internal
(16 bit) port doing SImode subtract.  The comments for expand_binop()
explicitly state that you can't rely on the target being set:

   If TARGET is nonzero, the value
   is generated there, if it is convenient to do so.

but we seem to have that expectation later:

  /* Main add/subtract of the input operands.  */
  x = expand_binop (word_mode, binoptab,
op0_piece, op1_piece,
target_piece, unsignedp, next_methods);

The only place where target_piece is assigned is in the (i > 0) case:

  if (i > 0)
{
  . . .
  emit_move_insn (target_piece, newx);
}

So it seems to me that in the (i == 0) case we need to see if
target_piece happened to receive the result, and if not, assign it.

It seems to me that this kind of bug should have been noticed already,
so... am I missing something?
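
The contract at issue can be modelled in a few lines of C. This is a toy, not the optabs.c code: toy_binop, subtract_into, and the honor_target switch are all made up for illustration. The callee treats the target as a hint and returns wherever the result actually landed, so the caller has to copy when the two differ.

```c
#include <stddef.h>

/* Toy callee: computes a - b.  The result goes to *target only if
   honor_target is set ("if it is convenient to do so"); otherwise it
   goes to *scratch.  Returns a pointer to wherever the result is. */
static int *toy_binop(int a, int b, int *target, int *scratch,
                      int honor_target)
{
    int *dest = (target != NULL && honor_target) ? target : scratch;

    *dest = a - b;
    return dest;
}

/* Toy caller: does not assume the target was used.  If the result
   ended up elsewhere, it is moved into the target explicitly. */
static void subtract_into(int a, int b, int *target, int honor_target)
{
    int scratch = 0;
    int *x = toy_binop(a, b, target, &scratch, honor_target);

    if (x != target)
        *target = *x;
}
```

With or without the hint being honored, subtract_into leaves the difference in the target, which is the property the patch below restores for the first subword.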


2005-03-21  DJ Delorie  <[EMAIL PROTECTED]>

* optabs.c (expand_binop): Make sure the first subword's result
gets stored.

Index: optabs.c
===
RCS file: /cvs/gcc/gcc/gcc/optabs.c,v
retrieving revision 1.265
diff -p -U3 -r1.265 optabs.c
--- optabs.c16 Mar 2005 18:29:23 -  1.265
+++ optabs.c22 Mar 2005 01:25:35 -
@@ -1534,6 +1534,11 @@ expand_binop (enum machine_mode mode, op
}
  emit_move_insn (target_piece, newx);
}
+ else
+   {
+ if (x != target_piece)
+   emit_move_insn (target_piece, x);
+   }
 
  carry_in = carry_out;
}

Re: RFA; DFP and REAL_TYPE?

2005-03-21 Thread Mark Mitchell

Jon Grimm wrote:
> So I've been looking at using REAL_TYPE to represent decimal floating
> point values internally (to implement the C extensions for decimal
> floating point.)  I believe David and yourself had some discussions on
> this some short time back.

FWIW, I'd rather see you stick with REAL_TYPE.  I think that decimal 
floating point is similar enough to binary floating point to make that 
worthwhile.

What I would hope would work would be modifying real.c to (a) directly 
support the decimal format for storage, and (b) update the emulation of 
floating-point operations to work correctly on the decimal format.  I 
definitely agree that translating into the binary format is likely to 
result in various errors.

I don't have an opinion on exactly what method of modifying real.c would 
be cleanest.

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304

Re: RFA; DFP and REAL_TYPE?

2005-03-21 Thread Robert Dewar

Mark Mitchell wrote:
> What I would hope would work would be modifying real.c to (a) directly
> support the decimal format for storage, and (b) update the emulation of
> floating-point operations to work correctly on the decimal format.  I
> definitely agree that translating into the binary format is likely to
> result in various errors.

I see no reason not to use binary format to store decimal numbers ...

> I don't have an opinion on exactly what method of modifying real.c would
> be cleanest.

Re: RFA; DFP and REAL_TYPE?

2005-03-21 Thread Mark Mitchell

Robert Dewar wrote:
> Mark Mitchell wrote:
>> What I would hope would work would be modifying real.c to (a) directly
>> support the decimal format for storage, and (b) update the emulation of
>> floating-point operations to work correctly on the decimal format.  I
>> definitely agree that translating into the binary format is likely to
>> result in various errors.
>
> I see no reason not to use binary format to store decimal numbers ...

I would expect that some decimal floating point values are not precisely 
representable in the binary format.

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304

Re: RFA; DFP and REAL_TYPE?

2005-03-21 Thread Robert Dewar

Mark Mitchell wrote:
> I would expect that some decimal floating point values are not precisely
> representable in the binary format.

OK, I agree that decimal floating-point needs its own format. But still
you can store the decimal mantissa and decimal exponent in binary format
without any problem, and that's probably what you want to do on a machine
that does not have native decimal format support. Even on a machine that
does have some support for decimal formats or arithmetic, you want to
check timing to see if these instructions are actually attractive to use.
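
Both halves of this exchange can be shown in a short C sketch. It is not real.c or decNumber code: struct toy_decimal and the function names are hypothetical. The first part keeps a decimal value as an integer coefficient and a power-of-ten exponent, each held in ordinary binary integers; the second part is the binary floating-point case for contrast.

```c
#include <stdint.h>

/* Hypothetical decimal value: coefficient * 10^exponent.  Both
   fields are plain binary integers, yet the value is exact decimal. */
struct toy_decimal {
    int64_t coefficient;
    int32_t exponent;
};

/* Add two values that already share an exponent.  Only integer
   arithmetic on the coefficients is involved, so nothing is rounded. */
static struct toy_decimal toy_add_same_exp(struct toy_decimal a,
                                           struct toy_decimal b)
{
    struct toy_decimal r;

    r.coefficient = a.coefficient + b.coefficient;
    r.exponent = a.exponent;
    return r;
}

/* Contrast: in IEEE binary double, 0.1, 0.2 and 0.3 are all rounded,
   and the rounded 0.1 + 0.2 does not compare equal to the rounded 0.3.
   Returns nonzero only if the sum compares equal. */
static int binary_sum_is_exact(void)
{
    volatile double a = 0.1, b = 0.2, c = 0.3;

    return a + b == c;
}
```

Here (1, -1) plus (2, -1) gives exactly (3, -1), i.e. 0.1 + 0.2 = 0.3, whereas the double comparison fails on an IEEE 754 host.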

Re: RFA; DFP and REAL_TYPE?

2005-03-21 Thread Mark Mitchell

Robert Dewar wrote:
> Mark Mitchell wrote:
>> I would expect that some decimal floating point values are not
>> precisely representable in the binary format.
>
> OK, I agree that decimal floating-point needs its own format. But still
> you can store the decimal mantissa and decimal exponent in binary format
> without any problem, and that's probably what you want to do on a machine
> that does not have native decimal format support.

I would think that, as elsewhere in real.c, you would probably want to 
use the same exact bit representation that will be used on the target. 
This is useful so that you can easily emit assembly literals by simply 
printing the bytes in hex, for example.

Of course, you could do as you suggest (storing the various fields of 
the decimal number in binary formats), and, yes, on many host machines 
that would result in more efficient internal computations.  But, I'm not 
confident that the savings you would get out of that would outweigh the 
appeal of having bit-for-bit consistency between the host and target.

In any case, this is rather a detail; the key decision Jon is trying to 
make is whether or not he has to introduce a new format in real.c, 
together with new routines to perform operations on that format, to 
which I think we agree the answer is in the affirmative.

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304

java on darwin8?

2005-03-21 Thread Mike Stump

Certainly I am doing something wrong, but if not...  anyone else  
seeing this?

/Volumes/mrs3/net/gcc-darwin/./gcc/xgcc -B/Volumes/mrs3/net/gcc- 
darwin/./gcc/ -B/Volumes/mrs3/Packages/gcc-20050128/powerpc-apple- 
darwin8.0.0/bin/ -B/Volumes/mrs3/Packages/gcc-20050128/powerpc-apple- 
darwin8.0.0/lib/ -isystem /Volumes/mrs3/Packages/gcc-20050128/powerpc- 
apple-darwin8.0.0/include -isystem /Volumes/mrs3/Packages/ 
gcc-20050128/powerpc-apple-darwin8.0.0/sys-include -m64 - 
DHAVE_CONFIG_H -I. -I../../../../gcc/libjava -I./include -I./gcj - 
I../../../../gcc/libjava -Iinclude -I../../../../gcc/libjava/include - 
I../../../../gcc/libjava/../boehm-gc/include -I../boehm-gc/include - 
I../../../../gcc/libjava/libltdl -I../../../../gcc/libjava/libltdl - 
I../../../../gcc/libjava/.././libjava/../gcc -I../../../../gcc/ 
libjava/../zlib -I../../../../gcc/libjava/../libffi/include -I../ 
libffi/include -Wextra -Wall -O2 -g -O2 -m64 -MT java/lang/sf_fabs.lo  
-MD -MP -MF java/lang/.deps/sf_fabs.Tpo -c ../../../../gcc/libjava/ 
java/lang/sf_fabs.c  -fno-common -DPIC -o java/lang/.libs/sf_fabs.o
In file included from ../../../../gcc/libjava/java/lang/fdlibm.h:23,
 from ../../../../gcc/libjava/java/lang/sf_fabs.c:20:
../../../../gcc/libjava/java/lang/ieeefp.h:157:2: error: #error  
Endianess not declared!!
In file included from ../../../../gcc/libjava/java/lang/mprec.h:30,
 from ../../../../gcc/libjava/java/lang/fdlibm.h:25,
 from ../../../../gcc/libjava/java/lang/sf_fabs.c:20:
../../../../gcc/libjava/java/lang/ieeefp.h:157:2: error: #error  
Endianess not declared!!
In file included from ../../../../gcc/libjava/java/lang/fdlibm.h:25,
 from ../../../../gcc/libjava/java/lang/sf_fabs.c:20:
../../../../gcc/libjava/java/lang/mprec.h:95: error: expected '=',  
',', ';', 'asm' or '__attribute__' before 'one'
In file included from ../../../../gcc/libjava/java/lang/sf_fabs.c:20:
../../../../gcc/libjava/java/lang/fdlibm.h:227:3: error: #error Must  
define endianness
make[4]: *** [java/lang/sf_fabs.lo] Error 1

Re: java on darwin8?

2005-03-21 Thread Andrew Pinski

On Mar 21, 2005, at 10:10 PM, Mike Stump wrote:
> Certainly I am doing something wrong, but if not...  anyone else
> seeing this?

You want to change the following "#if" in that file to include
__ppc64__:

#if defined (__PPC__) || defined (__ppc__)
Thanks,
Andrew Pinski
Who helped with the first porting of libjava in this place.
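
A sketch of the suggested change, for illustration only. The thread quotes the "#if" line but not what it guards, so the body here is an assumption: a hypothetical TOY_IEEE_BIG_ENDIAN macro stands in for whatever ieeefp.h actually defines, on the assumption that the PowerPC targets in question are big-endian.

```c
/* The quoted condition with __ppc64__ added.  The macro defined in
   the body is a hypothetical stand-in, not taken from ieeefp.h. */
#if defined (__PPC__) || defined (__ppc__) || defined (__ppc64__)
#define TOY_IEEE_BIG_ENDIAN 1
#else
#define TOY_IEEE_BIG_ENDIAN 0
#endif

/* Reports whether one of the PowerPC macros was seen at compile time. */
static int toy_ppc_big_endian(void)
{
    return TOY_IEEE_BIG_ENDIAN;
}
```

With -m64 on Darwin the compiler evidently defines neither __PPC__ nor __ppc__ in that configuration, which is why the header falls through to the "#error" shown in the log.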

Re: expand_binop misplacing results?

2005-03-21 Thread Roger Sayle

Hi DJ,

On Mon, 21 Mar 2005, DJ Delorie wrote:
> 2005-03-21  DJ Delorie  <[EMAIL PROTECTED]>
>
>   * optabs.c (expand_binop): Make sure the first subword's result
>   gets stored.

This is OK for mainline, provided that you bootstrap and regression
test it somewhere.  Thanks.  You're quite right that callers can't
rely on expand_binop placing the result in target, however most
backends and RTL expansion sequences try hard to honor the request.
There does appear to be a bug in the "i == 0" case, which has probably
never been an issue as most targets are either able to place the
result of the addition/subtraction in the requested destination or
provide their own adddi3/addti3 expanders.

Thanks for finding/fixing this.  This might be a candidate for backporting
to the GCC 4.0 branch if we can find a target/testcase that triggers a
problem.

Roger
--

Re: Obsoleting more ports for 4.0.

2005-03-21 Thread Mark Mitchell

Kazu Hirata wrote:
> Hi,
>
> First off, Mark, if you think this stuff is too late for 4.0, I'll
> postpone this to 4.1.  Please note that all we have to do is add a few
> lines to config.gcc as far as printing the "obsolete" message is
> concerned.

I think that if you get no objections to your message within a week, 
it's fine to obsolete these for 4.0.  You might consider leaving the 
ports in 4.1 until after 4.0 has been out for a month or so.  That will 
give users a chance to speak up, if they really do want these old ports.

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304

Re: Copyright question: libgcc GPL exceptions

2005-03-21 Thread John Marshall

Mike Stump wrote:
> The canonical form can be found in gcc/libgcc2.c:
>
> [...] (The General Public License restrictions
> do apply in other respects; for example, they cover modification of
> the file, and distribution when not linked into a combine
> executable.)

(Been wondering about this for a while...)  Possibly this is one of 
those North American dialect things, but to this (non-American) English 
speaker this canonical form appears to contain a typo.

"[...] when not linked into a combined executable", surely?
   John

ICE in gcc-4.0-20050305 for m68k

2005-03-21 Thread Dan Kegel

I tried building glibc-2.3.4 for m68k-unknown-linux-gnu with gcc-4.0-20050305,
and the compiler fell over in iconv/skeleton.c:
In file included from iso-2022-cn-ext.c:657:
../iconv/skeleton.c: In function 'gconv':
../iconv/skeleton.c:801: internal compiler error: output_operand: invalid 
expression as operand
...
make[2]: Leaving directory 
`...build/m68k-unknown-linux-gnu/gcc-4.0-20050305-glibc-2.3.4/glibc-2.3.4/iconvdata'
I'll post a proper problem report when I get a chance,
this is just a little heads-up.
- Dan
--
Trying to get a job as a c++ developer?  See 
http://kegel.com/academy/getting-hired.html

Re: Copyright question: libgcc GPL exceptions

2005-03-21 Thread Zack Weinberg

John Marshall <[EMAIL PROTECTED]> writes:

> Mike Stump wrote:
>> The canonical form can be found in gcc/libgcc2.c:
>>
>> [...] (The General Public License restrictions
>> do apply in other respects; for example, they cover modification of
>> the file, and distribution when not linked into a combine
>> executable.)
>
> (Been wondering about this for a while...)  Possibly this is one of
> those North American dialect things, but to this (non-American)
> English speaker this canonical form appears to contain a typo.
>
> "[...] when not linked into a combined executable", surely?

I think you are right, but changes to this wording need to be run by
the FSF.

zw

fyi: gcc_update merged to release branches

2005-03-21 Thread Zack Weinberg


I've merged the gcc_update --silent changes, and Andreas' quoting fix,
from mainline to the 3.4 and 4.0 branches.

zw

gcc with arm -vfp instructions

2005-03-21 Thread aram bharathi

hi,
  I would like to know whether gcc can generate VFP instructions.

#include <stdio.h>

int main(void)
{
    float a = 88.88f, b = 99.99f, c = 0;
    c = a + b;
    printf("%f\n", c);
    return 0;
}

I used the following option to compile the above program:

arm-elf-gcc -mfp=2 -S new.c

but it produces a new.s file without any VFP instructions.
How can I generate them using gcc?

Or do I have to use inline assembly for this operation?

If so, will it be supported by binutils and the gdb simulator?

thanks

